Biology from the bottom up


Scientists have overturned the conventional approach to studying cells to instead build life’s systems from scratch.


Nature has famously never produced a wheel. Humans famously did,
and have spent much of the time since urging each other
not to reinvent it. This example illustrates a clear difference
between two approaches to problem solving. Nature works with what
it has from the bottom up, and eventually finds a solution through an
inefficient process of trial and error. Nature has never explicitly asked
itself: how can I move this bulk from here to there as quickly and easily as
possible? Hence, no wheeled animals, although plenty of legs, wings and
other ways of getting about. Humans tend to take the opposite approach:
reduce, simplify and break down a complex problem to find the most
efficient solution.

This human framing of a problem is often described as top-down
analysis, and that’s usually how research into cell biology proceeds.
Even where the overall intention of the science is simply to expand
knowledge (compared with the specific task-focused goal of
engineering), the cell is too complex and sophisticated an object to
analyse without being broken down conceptually.

Top down involves a decomposition process. So although a 
researcher can make a career out of unpicking the workings of a 
cellular machine such as a ribosome or mitochondrion, the starting 
point for such projects has always been the role of these structures 
in existing cells. The work is directed by the context in which it 
originated and into which it will fold back once complete. 

Decomposition and working out from the top down how systems 
function is a valuable approach, but it might not be the best way to 
make a cellular process work better — or to produce a different one 
that does the same thing but more effectively. To do that, research- 
ers must be able to put aside the context, the system that evolution 
generated, and instead design and construct a system afresh from 
component parts, the so-called bottom-up approach. 

Take the very real challenge of finding a way to copy the natural 
process of photosynthesis — which could revolutionize energy pro- 
duction. As we discuss in a News Feature on page 172, one approach 
cell biologists are taking is to mix unusual combinations of enzymes — 
including some taken from bacteria and the human liver — to make 
different versions of metabolic pathways involved in photosynthesis 
and incorporate them into an artificial chloroplast. 

That research, and other work in a similar vein, is at the forefront of 
bottom-up biology. Biologists, physicists and chemists are attempting 
to reconstruct cellular processes by looking afresh at the constituent 
parts. In doing so, they argue, bottom-up science can extend the reach
of researchers and perhaps offer some solutions to long-standing problems.

BOTTOM-UP BIOLOGY: A Nature special issue (go.nature.com/bottomupbiology)


In a special issue this week, Nature brings together a series of 
articles that discuss and explore some of the challenges, opportuni- 
ties and complexity of this emerging field. At its most far-reaching, 
bottom-up biology could construct a reproducing artificial ‘cell’ 
completely from scratch. But it is important for researchers to focus 
on the benefits of such ambitious projects, not just the intellectual 
or practical challenges. A Comment piece on page 177 urges bot- 
tom-up biologists to set their sights on definite applications, such as 
artificial blood. 

Bottom-up biology is typically seen as different from ‘synthetic biology’,
which usually refers to an emerging branch of biotechnology that
aims to assemble some highly derived (synthetic) products by bringing 
many separate parts together in complex sequences of elementary 
steps. In pursuing this goal, synthetic biology uses both top-down and 
bottom-up approaches. 

The creation of living systems according to human design throws up 
some powerful questions — not least who gets given the responsibility 
to do it and how the work and what results from it can be controlled 
and regulated. So it’s important that scientists, policymakers and the 
public are kept informed and consulted about where this research 
could lead.


Launch sequence 


Life on Earth is to have its DNA analysed in a
welcome conservation effort.


An ambitious project launched last week aims to slow the
decline in biodiversity by sampling and decoding the DNA
of every species of plant and animal on Earth. Called the
Earth BioGenome Project, the effort is seeking funding to help it
get off the ground. It is asking for US$4.7 billion to sequence all
1.35 million known eukaryotic species (those with a cell nucleus
enclosed by a membrane) over the next 10 years.

Given the colossal scale of the crisis that faces life on the planet,
genomics might seem an unlikely saviour. Biology has certainly
advanced to a different realm since physicist Ernest Rutherford’s
famous quip that science was either physics or stamp collecting. But
how much, really, can reading the DNA sequences of species
save the organisms from the threat of climate change, the destruction
of their habitats or human over-exploitation of natural resources
through fishing and farming? To someone with a hammer, every
problem looks like a nail. Are scientists with DNA-sequencing
machines falling for the same logical fallacy? Is this a project that
is being done because technology means that it now can, rather
than because the need for conservation says that it must?

The organizers have yet to make their case fully — after all, the 
project is still on the drawing board — but the early signs suggest 
that it is worthwhile. Yes, it is likely to be relatively expensive to 
accomplish fully, but so is much of modern science on a grand scale. 
In today’s money, the Human Genome Project cost $5 billion, and 
few people would argue that this was not money well spent. The 
construction of the Large Hadron Collider, which discovered the 
Higgs boson, cost about the same amount. (And as Harris Lewin, 
the organizer of the London launch of the Earth BioGenome 
Project, provocatively asked: “What has the Higgs boson done for 
you lately?”) 

What can genomics do for conservation? Quite a lot, actually, and 
the vast scope of the project can easily obscure the intensely local 
insights that might emerge. To point to one small example reported 
this year, an analysis of 3,095 DNA variations called single nucleo- 
tide polymorphisms in the genome of the endangered eastern tiger 
salamander (Ambystoma tigrinum) in Long Island, New York, found 
that, because roads were restricting the animals’ movement between 
breeding ponds, genetic fragmentation of populations was occurring
(E. McCartney-Melstad et al. Preprint at bioRxiv http://doi.org/gdcd5x;
2018). The finding highlighted the need for conservation
efforts to focus on mitigating this development. 

But so far, scientists have just scratched the surface in terms of the 
diversity of organisms sequenced. And sequencing technologies are 
only now mature enough to generate high-quality (complete) genomes 
for in-depth studies. Of the 33,000 genomes in the archives of the 
US National Center for Biotechnology Information (which represent 
0.2% of eukaryotic species diversity), only 50% are of high quality. 

Arguably, the highest-quality (and the most expensive) genomes 
are not strictly necessary for conservation efforts to benefit, but 
they might reveal the route to new biofuels, drug leads and useful 
agricultural traits. Finding such applications, and so presenting the 
conservation of biodiversity as a boon to national economies, local 
cultures and the environment, should further help governments to 
take biodiversity issues even more seriously. 

Certainly, the need is urgent and the statistics alarming: 50% of
current biodiversity could be lost by the end of the century. Earth’s
sixth great extinction event is firmly under way, and ending this crisis
will take much more than DNA sequences. But the Earth BioGenome
Project can play a part, and early signs are that it might work.

It is right to seek commitment from participants, by asking them
to chip in with money from their own grants. And a good sign is that
it’s not a top-down monolith. Unlike a typical genome-sequencing
project, it has come together as a grass-roots initiative, driven
by individuals who study diverse groups
of organisms and who are already working
to sequence the organisms’ DNA. The new
project includes ongoing efforts such as i5K (insects), B10K (birds) 
and the Darwin Tree of Life project, which aims to sequence all of 
the estimated 66,000 eukaryotic species in the United Kingdom. That 
suggests the pay-off could come more quickly because many of the 
genomes are already targeted by research communities keen to process 
and annotate them. 

One looming issue is how easy it will be to transfer samples and 
genetic data across national borders. A meeting of the United Nations 
Convention on Biological Diversity (CBD) in Egypt later this month 
will consider new controls on the sharing of digital genetic data. The 
proposals would extend the reach of the 2014 Nagoya Protocol, which 
provides for equitable sharing of the benefits obtained from using 
genetic resources. If properly implemented, such rules will create 
greater legal certainty and transparency for the countries that provide 
such resources and the scientists and companies that use them. They 
will also help to boost local scientific capacity in the many poorest 
countries that hold some of the world’s richest biodiversity. 

Extending the protocol to cover genetic data makes sense, but, if done
clumsily, it could create a mess. The CBD has to its credit held extensive 
consultations with scientists and research institutions likely to be affected. 
The Earth BioGenome Project could help, by speaking as one voice for 
researchers. It’s better to have one international effort to negotiate solu- 
tions for data sharing, instead of a hotchpotch of complex individual 
and bilateral agreements. And that will help to ensure that the Earth 
BioGenome Project really does benefit the entire Earth.


Note worthy 


The Bank of England should put a female 
scientist on its next £50 note. 


What does Marie Curie have in common with the
bacteriologist Hideyo Noguchi and the theoretical
astrophysicist Victor Ambartsumian? They are among
the scientists who have featured on banknotes around the world 
(respectively, the old 20,000 Polish zloty, the ¥1,000 in Japan and the 
100 Armenian dram). Now, the British public has the chance to choose 
who should join them. Last week, the Bank of England announced that 
it is looking for an inspirational scientist to appear on the next £50 
note. It has invited suggestions and will pass them to a dedicated com- 
mittee, which will make the final decision and announce it next year. 
Scientists and engineers have featured heavily on UK banknotes 
since the bank started to print historical figures on their reverse sides 
in 1970. Generations of Britons have been paid with notes depicting 
Isaac Newton, George Stephenson, Michael Faraday and Florence 
Nightingale. The designs have not always pleased everyone. The 
£10 note released in 2000 featured Charles Darwin and his trip on 
HMS Beagle, but also threw in some hummingbirds — which many 
biologists felt were irrelevant. 
Whoever is chosen (the only binding criteria are that they must be 
British and dead) will replace the steam-engine pioneers Matthew
Boulton and James Watt on the £50 note, the highest denomination 
in circulation. It has yet to feature a woman, and this has led to sug- 
gestions that the Bank of England should choose a female scientist. 
Nature agrees. It’s true that this would rule out deserving figures 
such as Alan Turing and Stephen Hawking (who died this year and 
who bank officials have said would be allowed, even though the 
bank usually expects banknote candidates to have been dead for at 
least 20 years). But here is an opportunity to celebrate the hugely 
important achievements of a woman in science, and to offer an 
important and inspiring role model at the same time. 

A straw poll of some Nature staff highlighted some clear possi- 
bilities, none of whom will come as a particular surprise to readers. 
Mary Anning (1799-1847) was a prolific fossil hunter who changed 
the way we think about the history of life. Ada Lovelace (1815-1852) 
is credited with producing the first account of a prototype computer 
and its possible applications. Rosalind Franklin (1920-1958) was an 
X-ray crystallographer who played a key part in work to establish the 
structure of DNA. And Dorothy Hodgkin (1910-1994) remains the 
only British woman to win a science Nobel prize, for her research to 
unravel the structures of proteins including insulin. 

We intend to determine and submit our choice before the 
14 December deadline. We welcome the recommendations of readers 
everywhere as to who they would choose (e-mail: briefing@nature.com).
And we encourage you to submit your own nominations at
go.nature.com/2jrkt4y. The launch date of the note itself has not yet 
been confirmed, but it will not appear in circulation before 2020.




DANIEL PULLEN 


WORLD VIEW

What I learnt pulling a straw out of a turtle’s nose

When my video went viral, I found that communicating to non-scientists is
uncomfortable, and effective, says Christine Figgener.

Three years ago, I uploaded a video of a sea turtle in distress.
While collecting data for my PhD off the coast of Costa Rica,
my team decided to remove what looked to be a barnacle
encrusted in the nostril of the turtle, which we had captured for a
research study. The object turned out to be a 10-centimetre section
of a disposable plastic drinking straw. We filmed the process. That
upsetting video (see go.nature.com/2qfci6f) has now had more than
33 million views, and became an emblem of the anti-straw movement.

It also thrust me into a world of high-profile advocacy I never
expected to enter. I became involved in a documentary project, and
community activists who were launching plastic-free campaigns asked
for my support; I’ve gone to schools, conferences and screenings to
talk about a subject that is not my main research focus. Last month,
to my surprise, Time named me a 2018 Next Generation Leader, alongside
celebrities such as Ariana Grande and Hasan Minhaj. All this has taught
me that communicating beyond academia is worth trying, but it
demands constant vigilance and caution.

I always have to remind non-scientists that my video is, of course,
not the first documentation of how plastic harms marine wildlife. A
legion of scientific articles does exactly that. But, for many, it takes
videos such as mine to make these articles less abstract. I’d spent
years making videos that I hoped would encourage conservation by
showing the beauty of nature. They had little effect compared with my
video of a bleeding turtle and a spontaneous anti-straw tirade.

Many scientists shy away from the press, or from uploading videos that
show emotion, especially anger and frustration. We fear the
simplification and inaccuracies likely to be introduced into accounts of our
research, which could cause us to lose credibility with peers and funders.
Yet, these routes might be the most effective way of getting information
to policymakers and citizens, and so promoting conservation.

This year, companies including Alaska Airlines, Disney and
Starbucks announced programmes to phase out plastic straws. Seattle,
Washington, and San Francisco, California, among other cities, have
moved to ban or limit them. Of course, straws are just a tiny fraction
of the plastics in the ocean. (Roughly, they make up less than 0.03% of
the more than 8 million tonnes of plastic waste, largely consumer trash
and fishing nets, that makes its way to the ocean every year, mainly
from middle-income countries.)

I take care to explain that the straw is emblematic of unnecessary
plastic items and how human activity harms oceans. The message is
getting through. Last week, the European Union moved to prohibit
several single-use plastics, including plates and cutlery.

Activists need scientists’ input. When you’re trying to preserve
species effectively and have limited funds, you need to know which
life stages have the highest chance of survival and whether there is
enough suitable habitat left for a species to even sustain larger num- 
bers. Sometimes people are eager to undertake intense hands-on work 
(such as rescuing turtle eggs by digging them up and reburying them) 
even when less-dramatic efforts (such as establishing protected beach 
areas) would be sufficient and longer lasting. 

Delivering compelling messages is difficult. I am used to obsessing
over my data, not over how I look on camera. My research is dirty and 
smelly, full of long hours and unkempt hair. Conservation campaigns 
focus more on appearances, marketing and selling. 

Thanks to my video, I have acquired a thicker skin and an eclectic
set of skills ranging from copyright law and social-media marketing to
unconventional ways of fundraising (I started a
GoFundMe page for research). I learnt to ignore 
most rude and ignorant remarks: for instance, 
claims that I shoved the straw into the turtle’s 
nose for self-promotion. If I respond, I draft an 
unemotional e-mail debunking accusations point 
by point with established facts. 

What rankles more is when people try to take 
advantage of me. As in academia, philanthropy 
and advocacy are full of big egos that sometimes 
care more about advancing themselves than a 
cause. They are also less likely to buy into an 
ideal of citing and crediting others. I have learnt 
to be careful about how others use my work. 

It might seem to other early-career scientists 
that I won the lottery by publishing a gruesome 
video rather than hundreds of scientific articles, 
but I am not even sure whether my modicum
of celebrity makes me more or less employable. 
My advocacy has taken time away from my 
research. I still need to finish my dissertation on the migration patterns
of olive ridley sea turtles (Lepidochelys olivacea). Yet I am scared
that if I turn down speaking engagements or other chances to spread
the message about plastics pollution, I’m letting down the creatures
I’d hoped to help by studying them.

Although it might never feel entirely comfortable, I intend to keep 
straddling both academia and advocacy. After the straw-extraction 
video went viral, my colleague and I decided that we needed a conven- 
tionally citable publication, and so we wrote a piece that appeared in 
Marine Turtle Newsletter (N. J. Robinson and C. Figgener Mar. Turtle 
Newsl. 147, 5-6; 2015). That article exemplifies why doing outreach 
beyond academia is so important. Maybe a few hundred scientists read 
the peer-reviewed article, whereas millions of people saw the video. 
Which had the bigger impact?


Christine Figgener is a PhD student in the ID Marine Biology 
Program at Texas A&M University in College Station. 
e-mail: christine.figgener@tamu.edu 


8 NOVEMBER 2018 | VOL 563 | NATURE | 157 


© 2018 Springer Nature Limited. All rights reserved. 


SEVEN DAYS
The news in brief


EVENTS


Ebola vaccine

On 2 November, the Uganda Ministry of Health announced that it would
give the experimental Ebola vaccine, rVSV-ZEBOV, to around 3,000 people
at high risk of infection near the country’s border with the
Democratic Republic of the Congo (DRC). Uganda has not yet confirmed
any Ebola cases, but the outbreak in the DRC is spreading and the
World Health Organization (WHO) says there’s a high risk of it reaching
neighbouring countries. As of 3 November, the DRC had reported
298 cases and 186 deaths, including one in a previously unaffected
part of the country. The number of infections has increased by roughly
50% over the past 4 weeks, according to the WHO. In the past week, the
neighbouring Republic of the Congo, South Sudan, Uganda and Yemen
have reported suspected Ebola cases.


Grant-funding test

Australia’s government is set to introduce a ‘national-interest test’
for research projects seeking grant funding from next year. The policy
will require researchers to outline how their project will advance the
country’s interests, said education minister Dan Tehan in a statement
on 31 October. The test will apply to applications seeking money from
the Australian Research Council (ARC), a major funder of science and
humanities research. Research groups and academics have criticized the
decision, saying that grant assessments already demand a description
of a project’s potential benefits. The policy announcement came days
after news emerged that the previous education minister, Simon
Birmingham, used his authority to reject 11 grant applications that had
been recommended for ARC funding by independent peer-review panels
in 2017 and 2018.


Bold genome project kicks off

An ambitious effort to sequence the genome of every complex organism
on Earth was launched on 1 November in London. The Earth BioGenome
Project aims to decode the genomes of the roughly 1.5 million known
animal (pictured, composite image of platypus), plant, protozoan and
fungal species, the eukaryotes, over the next decade, at an estimated
cost of US$4.7 billion. As part of the effort, scientists at the
Wellcome Sanger Institute in Hinxton, UK, have committed up to
£50 million (US$65 million) over 8 years to sequence the genomes of
the eukaryotic species in the United Kingdom, thought to number about
66,000, among the largest commitments to the effort yet made. At the
London meeting, participants thrashed out guidelines for sample
collection, sequencing and data curation and sharing.


Boost for Plan S

Two of the world’s largest biomedical research funders have backed a
plan to make all papers resulting from work they fund open-access (OA)
on publication by 2020. On 5 November, the London-based Wellcome Trust
and the Bill and Melinda Gates Foundation in Seattle, Washington,
endorsed ‘Plan S’, adding their weight to the initiative, which is
already backed by 13 research funders from across Europe. Plan S was
launched in September and is spearheaded by Robert-Jan Smits, the
European Commission’s special envoy on OA. The Wellcome Trust, which
gave out £1.1 billion (US$1.4 billion) in grants in 2016-17, is also
the first funder to detail how it intends to implement Plan S. It
already has an OA policy, but it allows an embargo of up to six months
after publication before papers have to be made free to read. It now
says that by 2020, it will ban such embargoes. Wellcome-funded work
will not be able to appear in subscription journals unless these
publications permit papers to be published under OA terms. Wellcome
will also stop providing OA funds for ‘hybrid’ journals, but it will
not bar publication in them if scientists can pay for it themselves.


Gender protest 

More than 1,600 scientists in the United States sent a letter to the
US Department of Health and Human Services (HHS) on 2 November,
condemning the department’s recently revealed proposal to define
whether a person is male or female by their genitals at birth. The
presidents of three biological societies, which together represent
more than 3,000 scientists, sent a similar letter to the HHS last
week. The researchers reject the government’s argument that its
proposal is based on science. They say that the science linking
genetics, anatomy and gender identity is complex and still unclear,
and that implementing the policy would erase protections for and
identities of millions of people who do not identify with their sex
assigned at birth, including those who are transgender or intersex.
The proposed change would apply to Title IX, a US civil-rights law
that bans discrimination on the basis of sex in education programmes
that receive government funds.


FACILITIES 


Lost in space

NASA officials pronounced two of the agency’s long-running space
missions dead after the spacecraft ran out of fuel last month. On
30 October, NASA announced that its Kepler space telescope (pictured)
had ceased science operations. Since its 2009 launch, Kepler has
discovered more than 2,680 confirmed planets beyond the Solar System,
including Kepler-186f, an Earth-sized planet in the habitable zone.
The telescope also spotted thousands of other potential worlds that
are awaiting confirmation. And on 31 October, when the Dawn spacecraft
failed to make contact with Earth from its orbit around the asteroid
Ceres, mission managers concluded that it, too, had stopped
functioning. Launched in 2007, Dawn visited two big asteroids (Vesta
in 2011 and Ceres in 2015), the first probe to orbit two objects
beyond Earth.


India’s neutrino lab

Long-delayed efforts to build a major neutrino observatory in India
cleared a legal hurdle on 2 November, when the country’s National
Green Tribunal upheld environmental clearance for the project. The
clearance had been challenged by activists, who say that excavation
will affect wildlife and resources. The 15-billion-rupee
(US$206-million) facility, called the India-based Neutrino
Observatory, is set to be built under 1.2 kilometres of rock in the
southern state of Tamil Nadu. Physicists hope that the detector will
help them to elucidate the relative masses of neutrinos, elusive
particles produced when cosmic rays strike the atmosphere. India’s
government approved funding for the project in 2015, but the
experiment has been opposed by environmentalists and local
politicians. Before construction can begin, the project must also
gain approvals from wildlife and pollution boards.


‘Noteworthy’ figure

A scientist will grace Britain’s next £50 banknote, and the public
has the chance to suggest who it should be. The Bank of England is
looking for a UK scientist to replace the industrialists Matthew
Boulton and James Watt on the next iteration of the note, which will
be printed on a polymer material for the first time. Nominations can
be made through the bank’s website until 14 December, and proposed
scientists must be dead. An advisory committee, including four
scientists, will then whittle down the list to a few names. The
bank’s governor, Mark Carney, will make the final decision, which
will be announced in 2019. See page 156 for more.


Sanger probe

The Wellcome Sanger Institute in Hinxton, UK, has dismissed
allegations that its high-profile director, geneticist Mike Stratton,
bullied staff, discriminated against them because of their gender and
misused funds. On 30 October, the institute, one of the world’s top
genomics centres, announced that an investigation had cleared
Sanger’s management, in particular Stratton, of these accusations.
Carried out by the barrister Thomas Kibling at Matrix Chambers,
London, the investigation did identify “failings in the way in which
people have been managed”, and a lack of diversity at senior levels
of the organization. In a statement, Stratton apologized for
“failures in people management that have occurred and have had
unintended detrimental effects on individuals”.


TREND WATCH

Unpaid research stints are common in sub-Saharan Africa, according to
an online survey of 412 academics that spanned 6 countries.
Eighty-five per cent of respondents report having had research
positions with no pay. Of those, 33% had spent between 1 and 5 years
doing research for free, and 4% had spent more than 5 years doing so.

Many researchers on the continent work with organizations out of
personal interest or to contribute to a specific cause, says Lem
Ngongalah, head of the Collaboration for Research Excellence in
Africa in Douala, Cameroon, which posted the results on 17 October
(L. Ngongalah et al. Preprint at bioRxiv http://doi.org/cwr9; 2018).
“In many cases, there is no payment for such work, because the
organization itself has no funding for the work it does,” she says.
“Funding continues to be a significant challenge for research in
Africa.”

The respondents, who come from Cameroon, Nigeria, Kenya, Uganda,
Tanzania and South Africa, also identified other barriers to
conducting research in Africa, including a shortage of training
facilities and a loss of motivation to continue research.


UNPAID RESEARCH IN AFRICA

Some 85% of 412 academics and students in 6 countries reported
having been in research positions with no pay at some point in their
career. Academics cite a lack of funds for research as one of the main
causes.

No unpaid research experience: 15.5%
Unpaid research experience: 84.5%
    Less than 6 months: 23.3%
    6 months - 1 year: 24.3%
    1-3 years: 26.2%
    3-5 years: 6.8%
    >5 years: 3.9%
ROBIN LOZNAK/ZUMA WIRE


NEWS IN FOCUS 


Cost of South Africa’s invasive species laid bare in government report p.164

AI spots natural selection at work in the human genome p.167

Enthralling ‘Cow’ supernova divulges its secrets p.168

The quest to create cells from scratch p.171


A group of children and young people (shown here with lawyer Julia Olson) is suing the US government to force stronger action on climate change. 


Historic kids’ climate 
lawsuit gets green light 


Young people claim US government has violated their rights by failing to avert warming. 


BY EMMA MARRIS 


A landmark climate-change lawsuit brought by young people against the
US government can proceed, the Supreme Court said on 2 November. The case,
Juliana v. United States, had been scheduled to
begin trial on 29 October in Eugene, Oregon,
in a federal district court. But those plans were
scrapped last month, after President Donald
Trump’s administration asked the Supreme
Court to intervene and dismiss the case.

The plaintiffs, who include 21 people ranging 
in age from 11 to 22, allege that the govern- 
ment has violated their constitutional rights to 
life, liberty and property by failing to prevent 
dangerous climate change. They are asking the 
district court to order the federal government 
to prepare a plan that will ensure that the level 
of carbon dioxide in the atmosphere falls below
350 parts per million by 2100, down from an
average of 405 parts per million in 2017. 

By contrast, the US Department of Justice
argues that “there is no right to ‘a climate system
capable of sustaining human life’”, as the
Juliana plaintiffs assert.

Although the Supreme Court has now 
denied the Trump administration's request to 
dismiss the case, the path ahead is unclear. In 
its 2 November order, the Supreme Court 


VOL 563 | NATURE | 163 




suggested that a federal appeals court
should consider the administration's argu- 
ments before any trial starts in the Oregon 
district court. Lawyers for the young people 
said that they would push the district court to 
reschedule the trial this week. 

“The youth of our nation won an important 
decision today from the Supreme Court that 
shows even the most powerful government in 
the world must follow the rules and process of 
litigation in our democracy,” said Julia Olson,
co-counsel for the plaintiffs, in a statement. 

Although climate change is a global problem, 
lawyers around the world have brought cli- 
mate-change-related lawsuits against local and 
national governments and corporations since 
the late 1980s. These suits have generally sought 
to force the sort of aggressive action against cli- 
mate change that has been tough to achieve 
through political means. 

Many of the cases have failed, but in 2015, 
a citizens’ group called the Urgenda Foundation
won a historic victory against the Dutch
government. The judge in that case ordered 
the Netherlands to cut its greenhouse-gas 
emissions to at least 25% below 1990 levels by 
2020, citing the possibility of climate-related 
damages to “current and future generations 
of Dutch nationals” and the government’s 
“duty of care ... to prevent hazardous climate 
change”. A Dutch appeals court upheld the 
verdict last month. 

Over the past few years, the Dutch case 
has emerged as a model for climate lawsuits 
in other countries, says Gillian Lobo, a law- 
yer who specializes in climate-change-related 
cases at ClientEarth in London. More recently, 
she says, the Juliana lawsuit has inspired 


its own copycats — some of which have 
progressed further than Juliana itself. “It is a 
global phenomenon,” Lobo says.

One case modelled on the Juliana lawsuit 
has already produced a striking victory. In 
January, 25 young people sued the Colombian 
government for their right to a healthy envi- 
ronment, in a case called Demanda Genera- 
ciones Futuras v. Minambiente. 

The Colombian Supreme Court found in
the plaintiffs’ favour in April. Not only did
it order the government […] Colombian Amazon
rainforest is “a subject of rights” that is
entitled to “protection, conservation,
maintenance and restoration”.

The young plaintiffs in the Juliana case 
allege that they have already suffered harm 
from climate change. Seventeen-year-old 
Jaime and her family left their home on the 
Navajo Nation Reservation in Cameron, Ari- 
zona, in 2011 because the springs that supplied 
their water were drying up. Fifteen-year-old 
Jayden’s home in Louisiana was severely dam- 
aged by flooding in 2016, and 19-year-old Vic's 
school in White Plains, New York, closed tem- 
porarily in 2012 after Hurricane Sandy hit. 

US climate hawks hope that the Juliana 
plaintiffs will ultimately prevail, but President 
Trump’s administration is mounting a mul- 
tipronged defence. The Justice Department 
denies that the district court in Oregon has 
jurisdiction over the broad sweep of federal 




policies at issue, and that the rights to life, 
liberty and property set out in the Constitu- 
tion translate into the right to a stable climate. 
In any case, the department argues, no mean- 
ingful redress is possible, given that sharp cuts 
in US emissions might not move the needle 
on climate change much if other countries’ 
greenhouse-gas output grows. 

Andrea Rodgers, co-counsel for the Juliana 
plaintiffs, says that the Trump administration 
hasn't challenged the fact that humans are 
changing the climate. “They haven't presented 
experts to contest what our scientists are say- 
ing about ice melt or sea-level rise or terres- 
trial impacts or how climate change happens 
or ocean acidification,” she says. 

To win, Rodgers says, “we have to show that 
the United States government is liable, but also 
that there is a remedy that the judge can order”. 
The United States has seen its greenhouse-gas 
emissions drop in recent years, as the coun- 
try shifts its energy mix away from coal and 
towards renewable sources, but as of 2016, it 
remains the second-largest emitter after China. 

James Hansen, a climatologist at Columbia 
University in New York City and a long-time 
climate activist, is an expert witness in the case 
—and a plaintiff, representing “future genera- 
tions” not yet born. (His 20-year-old grand- 
daughter Sophie Kivlehan is also a plaintiff.) 

Hansen has been fighting for action on 
climate change since he first testified on the 
subject before the US Senate in 1988. He says 
that if the Juliana plaintiffs lose their case, he 
will simply try another way. “We need to win 
as soon as possible,” Hansen says, “but if we
lose, we don’t give up — we come back with a 
stronger case.”


South Africa’s invasive species guzzle 
water and cost US$450 million a year 


The country’s first report on its biological invaders is pioneering in scope, and paints a dire 
picture for resources and biodiversity. 


BY SARAH WILD 


South Africa is losing its battle against
biological invaders, according to the gov-
ernment’s first attempt to comprehensively 
assess the status of the country’s alien species. 
The invaders, including forest-munching 
wasps and hardy North American bass, cost the 
country around 6.5 billion rand (US$450 mil- 
lion) a year and are responsible for about 
one-quarter of its biodiversity loss. That’s the 
conclusion of a pioneering report (see go.nature.com/2qmwgag)
that the South African National


Biodiversity Institute in Pretoria released on 
2 November. 

Invasive species also guzzle water, a 
serious problem in a country suffering from 
a prolonged and catastrophic drought that is 
expected to worsen as the climate changes. 

The report, which the institute compiled 
in response to 2014 regulations that mandate 
a review of invasive species every three years, 
examines the pathways by which these species 
enter the country and the effectiveness of inter- 
ventions. It also weighs the toll they take on the 
nation’s finances and biodiversity. 
This achievement constitutes a “significant
advance” compared with efforts by most other 
countries, says Piero Genovesi, who chairs the 
invasive species specialist group of the Inter- 
national Union for Conservation of Nature in 
Rome. He says that other reports have looked 
at the impact of biological invasions, or at 
measures to address the problem, but have not 
considered all aspects of invasions. 

The report provides “an incredible basis” on 
which to deal with invasive species in South 
Africa, says Helen Roy, an ecologist at the Cen- 
tre for Ecology and Hydrology near Oxford, UK. 


MARK MOFFETT/MINDEN PICTURES/ALAMY 


The invasive ant Linepithema humile disrupts seed dispersal in indigenous South African plants. 


Across the world, invasive species — 
organisms that have been introduced into 
ecosystems beyond their natural habitats, and 
that spread over large distances on their own 
— are considered a major threat to biodiversity, 
human health and economies. Climate change 
is expected to further their global spread, in part 
by reducing the resilience of native ecosystems. 

To create the report, in 2015, 37 research- 
ers from 14 national organizations, led by the 
National Biodiversity Institute and the Centre 
of Excellence for Invasion Biology at Stellen- 
bosch University, began collating data from 
institutions around the country. 


MAJOR IMPACTS 

The researchers report that 7 new species are 
introduced into South Africa each year, and 
that about 775 invasive species have been 
introduced so far. This contrasts with the 
556 invasive species previously reported by 
the government. Most of the species identi- 
fied by the latest report are plants, with insects 
the next most common. (For comparison, 
the United Kingdom says that it has 184 non- 
native invasive species.) The report’s authors 
consider 107 of the species in South Africa to 
have major impacts on biodiversity or human 
well-being. 

Invaders of note include trees in the Prosopis 
genus, such as honey mesquite (P. glandulosa), 
which damages animal grazing areas, out- 
competes local plants and, according to a 
2017 study in Mali, seems to encourage the 
growth of populations of the malaria-carry- 
ing Anopheles mosquito, among other effects 
(G. C. Muller et al. Malar. J. 16, 237; 2017). 

Others include the Sirex wasp (Sirex 
noctilio), which threatens South Africa's 
16-billion-rand forestry industry; the ant 


Linepithema humile, from Argentina, which 
disrupts seed dispersal in indigenous plants; 
the North American small-mouth bass 
(Micropterus dolomieu), which has outcom- 
peted indigenous fish; and the water hyacinth 
(Eichhornia crassipes), from South America, 
which chokes the country’s waterways. 

As well as their cost and toll on biodiversity, 
the report explores the pressure that invasive 
species put on the water supply. This year, 
Cape Town almost became the first major 
city in the world to run out of water. In May, 
researchers argued that alien plants, which 
often use more water than do indigenous ones, 
consumed more than 100 million litres of 
water a day — about one-fifth of the city’s daily 
usage — and that water losses due to invasive 
species could triple by 2050. The report esti- 
mates that invasive trees and shrubs, if left 
unchecked, could threaten up to one-third of 
the water supply to cities such as Cape Town, 
and consume up to 5% of the country’s mean 
annual rainfall run-off. 

Despite enacting the 2014 regulations and 
spending at least 1.5 billion rand a year to curb 
invasive species, the country is not keeping up, 
says the report. “The most concerning find- 
ing was how ineffective we have been,” says co- 
author Brian van Wilgen, an applied ecologist 
at Stellenbosch University. 

But the authors also note that their con- 
fidence in almost all their estimates is low, 
because of poor monitoring and evaluation 
data — and that more research into impacts 
and monitoring techniques is needed. 

Jasper Slingsby, an ecologist with the South 
African Environmental Observation Network 
in Cape Town, agrees. “We need better funding 
and concerted research effort in this space as a 
national priority,” he says.




Australia cuts 
coral research 


Reef-science centre set to 
lose government funding. 


BY ADAM MORTON 


Ocean researchers around the world
are dismayed that an Australian
research institute that has become
an international authority on the declining 
health of reef ecosystems will lose most of 
its government funding after 2021. 

Papers by scientists at the Centre of 
Excellence for Coral Reef Studies, based 
at James Cook University in Townsville, 
were cited almost 40,000 times in 2017 
— the most citations for any institute in 
the world doing reef science. But in late 
October, it emerged that the Australian 
Research Council (ARC), an independent 
government agency, had not shortlisted the 
centre to receive a share of the latest round 
of funding. The ARC has funded the centre 
since its inception 13 years ago. 

The centre will lose 37% of its current 
annual budget of about Aus$12 million 
(US$8.7 million), and its title as an ARC 
centre of excellence. James Cook University 
says it is committed to delivering world-class 
coral-reef research into the future, but has 
not explained how the centre will be supported.
The centre’s director, Terry Hughes,
declined to comment on the decision. 

Scientists fear job losses and a reduced 
research capacity are to come. They say 
the centre’s work is important to people 
living alongside reefs across the tropics. “It 
is deeply stupid for Australia not to fund, 
or even consider funding, its world-leading 
coral-reef research,” says Garry Peterson, an
environmental scientist at the Stockholm 
Resilience Centre. 

The coral-reef centre employs about 
300 scientists. Its most celebrated work, 
which established the extent of recent 
bleaching along the Great Barrier Reef 
(T. P. Hughes et al. Nature 543, 373-377; 
2017), involved aerial surveys and 100 divers. 

Some researchers link the ARC’s decision 
to the Australian government's failure to 
adequately address climate change, which is 
the greatest threat to coral reefs. “A different 
government with a different outlook would 
have found a way to support that centre,” 
says physicist Bill Hare, chief executive of 
the climate-research and policy institute 
Climate Analytics in Berlin. 

But ARC chief executive Sue Thomas 
says that the decision was based on a 
standard competitive process.
ATMOSPHERIC SCIENCE


Towering thunderstorms regularly roll over central Argentina. 


Inside Argentina’s 
mega-storms 


Massive project aims to improve severe-weather predictions
in shadow of the Andes mountains. 


BY ALEXANDRA WITZE 


Some of the worst thunderstorms on the
planet are about to give up their secrets.

Deadly downpours, grapefruit-sized
hail and severe lightning regularly pepper
the eastern side of the Andes mountains in
Argentina. These storms often flood towns
and destroy the vineyards of the region’s wine
industry, but remain poorly understood. About
160 atmospheric scientists, mostly from the
United States, Argentina and Brazil, have
descended on central Argentina to change that.

Their ultimate goal is to improve severe-weather
warnings, so that people know to
avoid areas where flash floods are likely, or to
prepare their vineyards for a hailstorm.

The US$30-million project kicked into high
gear on 1 November, as researchers headed to
the centre of the country with storm-chasing
equipment, including radar scanners mounted
on trucks. The atmospheric-sciences experiment,
called Remote sensing of Electrification,
Lightning, And Mesoscale/microscale
Processes with Adaptive Ground Observations
(RELAMPAGO, which is Spanish for lightning),
is the biggest of this type ever conducted
outside the United States.

“It’s the craziest activity I have ever been 
in in my life,” says 
Paola Salio, an 
atmospheric scien- 
tist at the University 
of Buenos Aires and 
the Argentina lead 
on the project. “But 
it is also like a dream 
come true.” 

From now until mid-December, the 
scientists hope to chase at least a dozen severe 
storms to study air temperature, wind speed 
and direction, rainfall amounts, the number 
of lightning strikes and other factors. They 
want to use those data to improve models of 
how descending air on the eastern side of the 
Andes triggers towering thunderstorms that 
regularly reach 18 kilometres into the atmos-
phere. Such storms are more powerful than 
typical thunderstorms elsewhere, which might 
grow 12 kilometres high. 

The lines of thunderstorms that often form 
along the Andes look very similar to the ones 
in the central United States that usually pro- 
duce tornadoes. But the Argentinian storms 
are larger and, for some reason, don't spawn 
tornadoes nearly as often as the US storms do. 

“That’s one of the mysteries we want to 
answer, why there are so few tornadoes,” says 
Steve Nesbitt, an atmospheric scientist at the 
University of Illinois at Urbana-Champaign 
who heads RELAMPAGO. 

In addition, the researchers will drive 
hundreds of kilometres southwest of their base 
near Córdoba to target systems that produce
strong hail in Mendoza province. 

A second, related project called CACTI 
(Cloud, Aerosol, and Complex Terrain Interac- 
tions) will focus on how atmospheric particles 
such as dust or haze influence storm develop- 
ment. Funding for both projects comes from 
national research agencies and institutions 
in the United States — such as the National 
Science Foundation and the Department of 
Energy — Argentina and Brazil. 

The work would not have been possible 
a few years ago, before Argentina beefed up 
its national weather radar system. Workers 
installed the first of the upgraded radars in 
Córdoba in 2015, says Celeste Saulo, director
of Argentina's weather service in Buenos Aires. 
There are seven other such radars operating 
around the country, and three more should be 
up and running by December, she adds. 

RELAMPAGO scientists plan to compare 
the data from the Córdoba radar with those
from their truck-based instruments — which 
can reach more rural areas and capture addi- 
tional information on how storms grow — to 
gain a better picture of how severe weather 
works in central Argentina. 

During the project, the weather service 
will test a type of forecasting system that 
continually ingests updated weather data to 
improve forecasts. It’s similar to ones used 
by meteorologists in the United States and 
Europe. Argentina's weather agency wants to 
use the system going forward, Saulo says. 

RELAMPAGO could even provide a glimpse 
of the future, says Kristen Rasmussen, an 
atmospheric scientist at Colorado State Uni- 
versity in Fort Collins. As global temperatures 
rise, the warming atmosphere will provide 
more energy to feed thunderstorms around 
the world. Rasmussen's computer simulations 
show that those changes could result in storms 
similar to the powerful ones now seen in 
Argentina (K. L. Rasmussen and R. A. Houze 
Jr Mon. Weather Rev. 144, 2351-2374; 2016). 

“What we're seeing in South America 
could be more like what we will see in a future 
climate,” she says. This means that other parts 
of the world could soon get a taste of the storms 
that Argentina knows so well.
Deep learning spots natural
selection at work 


Scientists use artificial intelligence to hunt for genetic sequences moulded by evolution. 


BY AMY MAXMEN 


Pinpointing where and how the human
genome is evolving can be like hunting
for a needle in a haystack. Each person's
genome contains three billion building blocks
called nucleotides, and researchers must compile
data from thousands of people to discover
patterns that signal how genes have been
shaped by evolutionary pressures.

To find these patterns, a growing number
of geneticists are turning to a form of machine
learning called deep learning. Proponents
of the approach say that deep-learning
algorithms incorporate fewer explicit assumptions
about what the genetic signatures of
natural selection should look like than do
conventional statistical methods.

“Machine learning is automating the ability
to make evolutionary inferences,” says Andrew
Kern, a population geneticist at the University
of Oregon in Eugene. “There is no question
that it is moving things forward.”

One deep-learning tool called DeepSweep,
developed by researchers at the Broad Institute
of MIT and Harvard in Cambridge, Massachusetts,
has flagged 20,000 single nucleotides for
further study. These simple mutations might
have helped humans to survive disease, drought
or what Charles Darwin called the “conditions
of life”, researchers reported last month at the
annual meeting of the American Society of
Human Genetics in San Diego, California.

Since the 1970s, geneticists have created
mathematical models to describe the fingerprint
of natural selection in DNA. If a mutation
arises that renders a person better able to
survive and produce offspring than their
neighbours, the percentage of the population
with that gene variant will grow over time.

One example is the mutation that gives
many adults the ability to drink cow’s milk. It
enables the body to produce lactase, an enzyme
that digests the sugar in milk, into adulthood.
By analysing human genomes using statistical
methods, researchers discovered that
the mutation spread rapidly through Europe
thousands of years ago, presumably because
nutrients in cow’s milk helped people to produce
healthy children. Today, nearly 80% of
people of European descent carry this variant.

Yet geneticists have struggled to identify,
and confirm, other specific snippets of the
genome that spread throughout populations
because they provided an adaptive edge. Deep
learning excels at just this sort of task: discovering
subtle patterns in large amounts of data.

But there is a catch. Deep-learning algo- 
rithms often learn to classify information 
after being trained by exposure to real data; 
Facebook, for example, primes algorithms to 
recognize faces using pictures that people have 
already labelled. Because geneticists don't yet 
know which parts of the genome are being 
shaped by natural selection, they must train 
their algorithms on simulated data. 

To generate that simulated data, researchers 
need to imagine what the signature of natural 
selection looks like, says Sohini Ramachan- 
dran, a population geneticist at Brown Uni- 
versity in Providence, Rhode Island. “We don't 
have ground-truth data, so the worry is that we 
may not be simulating properly.” 

And because deep-learning algorithms 
operate as black boxes, it’s hard to know what 
criteria they use to identify patterns in data, 
says Philipp Messer, a population geneticist 
at Cornell University in Ithaca, New York. “If 
the simulation is wrong, it’s not clear what the 
response means,” he adds. 

Researchers who use deep-learning algo- 
rithms do try to peek into the black box. 
DeepSweep’s creators trained the algorithm 
on signatures of natural selection that they 
inserted into simulated genomes. When they 
tried it on real genomes, the algorithm zeroed 
in on the mutations that allow adults to drink
milk. That bolstered the team’s confidence in 
the tool, says Joseph Vitti, a computational 
geneticist at the Broad Institute who helped to 
develop DeepSweep. 

The researchers then sifted through data 
from the 1000 Genomes Project — an initia- 
tive that sequenced DNA from 2,504 people 
around the world — using a statistical method 
to identify regions that might be under evolu- 
tionary pressure. When DeepSweep examined 
these areas more closely, it delivered a list of 
20,000 single mutations to explore. 

In the coming months, Vitti and his col- 
leagues will investigate what these mutations 
do by editing them in the DNA of living cells, 
to compare what happens when the mutations 
are present with when they are not. 

Several other researchers are training deep- 
learning algorithms to search for signs of adap- 
tation in genomes. A deep-learning model 
developed by Kern suggests that, at first, most 
mutations in humans are neither beneficial nor 
harmful. Rather, they seem to drift along in
populations, increasing natural genetic vari- 
ability, and become more frequent only when 
a change in the environment gives people with 
a particular mutation an evolutionary edge. 
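
The idea that a favoured variant becomes more common over time can be illustrated with a short calculation. What follows is only a minimal sketch, assuming a textbook one-locus haploid selection model with a constant selection coefficient s. It is not the method used by DeepSweep, SWIF(r) or Kern's model, and the function names and parameter values are hypothetical. Random drift is deliberately left out, so a neutral variant (s = 0) simply stays at its starting frequency.

```python
def next_frequency(p, s):
    """Advance one generation of deterministic haploid selection.

    p: current frequency of the variant, between 0 and 1
    s: selection coefficient (assumed constant); 0 means neutral
    """
    # Carriers have relative fitness 1 + s, non-carriers have fitness 1.
    return p * (1 + s) / (1 + p * s)


def trajectory(p0, s, generations):
    """Return the variant's frequency at generations 0..generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


if __name__ == "__main__":
    # Illustrative values only: a variant starting at 1% of the population.
    neutral = trajectory(0.01, 0.0, 400)
    favoured = trajectory(0.01, 0.05, 400)
    print(f"neutral variant after 400 generations:  {neutral[-1]:.3f}")
    print(f"favoured variant after 400 generations: {favoured[-1]:.3f}")
```

In this toy model the neutral variant never changes frequency, whereas the favoured one rises steadily towards fixation. Real populations add drift, changing environments and many interacting sites, which is part of why researchers turn to simulation and machine learning.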

In February, Ramachandran and her colleagues
reported on a deep-learning algorithm
they developed, called SWIF(r). When
they applied it to the genomes of 45 members 
of the Khomani San ethnic group in south- 
ern Africa, it flagged variations near genes 
associated with metabolism. The researchers 
speculate that the changes could have occurred 
thousands of years ago and helped members of 
the group to store fat when food was scarce. 

The effects of the mutations still need to 
be tested. But, as with the variants spotted 
by DeepSweep, the candidates singled out by 
SWIF(r) provide scientists with a place to start. 

“These are incredibly powerful methods for 
looking for the signals of natural selection,” 
says Pardis Sabeti, a computational geneticist 
at the Broad Institute, and Vitti’s PhD super- 
visor. “Some people didn't think you could 
pinpoint variants when I started. Some 
thought it was impossible.”
Mauna Kea, the planned site of the Thirty Meter Telescope, is sacred to many Native Hawaiians.


ASTRONOMY 


Embattled telescope 
scores big win 


Hawaii’s supreme court rules that the Thirty Meter 
Telescope’s construction permit is valid. 


BY ALEXANDRA WITZE 


Hawaii’s supreme court has ruled in
favour of building the Thirty Meter
Telescope (TMT) atop the mountain
Mauna Kea. The decision removes the last legal 
hurdle preventing the US$1.4-billion project 
from resuming construction. 

“This clears the way for the TMT to begin 
construction,” says Doug Simons, executive 
director of the Canada-France-Hawaii Tele- 
scope, which is located on Mauna Kea. “So, 


yeah, it’s a really big deal.”

For years, the next-generation astronomical 
observatory has been mired in public protests 
and legal challenges. Some Native Hawaiians 
say that building the mega-telescope would 
further desecrate a sacred mountain that is 
already home to multiple observatories. In 
April 2015, protesters blocked the road to 
Mauna Kea’s summit as construction of the 
TMT was set to begin. That December, the 
state supreme court revoked the project’s 
construction permit, saying that the state 


government had granted it before opponents 
of the telescope could have their full say. 

Hawaii’s Board of Land and Natural 
Resources issued a fresh construction permit 
in September 2017, prompting opponents to 
appeal. The latest ruling upholds that permit. 

A separate legal issue, involving the Uni- 
versity of Hawaii's sublease of land on Mauna 
Kea for the TMT site, was resolved in August. 
The state supreme court ruled in the project’s 
favour in that case, as well. 

TMT opponents have few legal options; they 
include petitioning the US Supreme Court. 

One of the groups opposing the TMT, the 
environmental advocacy organization KAHEA 
in Honolulu, said it was “disappointed” by the 
latest ruling. “Thousands of Hawaiian cultural 
practitioners have affirmed the sacredness of 
the entirety of Mauna Kea,” the group said in 
a statement. 

TMT officials have been considering an 
alternative site for the telescope, in Spain's 
Canary Islands, in case they cannot resolve 
the obstacles to building in Hawaii. It could 
take months before project leaders decide 
whether to go ahead in Hawaii, now that they 
have the supreme court’s backing. Among the 
issues they face is how to restart construction 
on Mauna Kea, given the protests that broke 
out the last time they tried to do so. 

“We remain committed to being good 
stewards on the mountain and inclusive of the 
Hawaiian community,” said Henry Yang, chair
of the TMT International Observatory board 
of governors, in a statement. 

In Hawaii, the battle over how Mauna Kea 
is used may soon shift from the TMT to the 
University of Hawaii’s master lease, which cov- 
ers all land on the mountain that is used for 
astronomical observatories. The lease expires 
in 2033, and Shelley Muneoka, a representative 
of KAHEA, says that the group is considering 
a challenge to the lease’s renewal.


TMT INTERNATIONAL OBSERVATORY 


ASTROPHYSICS 


Mystery supernova known 
as ‘Cow’ spills its secrets 


A superbright explosion in a galaxy far, far away has drawn astronomers’ full attention. 


BY DAVIDE CASTELVECCHI 


For many astronomers, 2018 will be
remembered as the year of the cow,
after the nickname of a spectacular stellar
explosion that has kept them busy for months.


The unusual event has offered an unprece- 
dented window on to the collapse of a star, two 


teams of researchers suggest in papers submitted
to the arXiv preprint server on 25 October (refs 1, 2).
In contrast to the slow ramp-up of a
typical supernova, Cow became stupendously 
bright essentially overnight, leaving astrono- 
mers perplexed. “It popped up out of nowhere,” 
says Stephen Smartt, an astronomer at Queen's 
University Belfast, UK, who discovered the 
explosion on 16 June. He named it according to
an alphabetical protocol that just happened to 
spell out the word ‘Cow’ — technically, it is the 
event AT2018cow. 

Iair Arcavi, an astrophysicist at the University
of California, Santa Barbara, says that “pretty 
much everything about its emission is some- 
thing we haven't seen before.” This is “the 


dream” for those who study stellar explosions, 
adds Raffaella Margutti, an astrophysicist at 
Northwestern University in Evanston, Illinois, 
who led one of the teams behind the two papers. 

The two groups behind the latest papers 
arrived independently at the same conclusion: 
that a ‘central engine’ has kept agitating the 
exploding star for months, and that the energy 
must have come from either a newly formed 
black hole in the process of accreting matter, or 
the frenetic rotation of a neutron star. 

Both black holes and neutron stars are born 
when massive stars reach the end of their lives. 
Explosions such as Cow could provide direct 
evidence of this type of birth, says Mansi Kasli- 
wal, an astronomer at the California Institute 
of Technology (Caltech) in Pasadena. “I think 
this is telling us about how to understand the 
most extreme incarnations of massive-star 
explosions.” Arcavi is impressed by the quality
of the observations and the latest results, but, 
he says, “there's still no bottom line as to what 
this is”. For now, Cow remains a mystery. 

After the initial discovery, Smartt traced Cow 
to a galaxy called CGCG 137-068 known to be 
around 60 megaparsecs (200 million light years) 
away. And this was no ordinary supernova: it 
reached its peak brightness in days, not weeks. 
“Everybody put down what they were doing up 
to that point” and started following Cow, says 
Daniel Perley, an astrophysicist at Liverpool 


John Moores University, UK. Perley and his 
collaborators commanded a robotic telescope 
on La Palma, one of Spain’s Canary Islands, to 
observe Cow nearly every night for a month and 
a half. They also used other telescopes around 
the globe that belong to a network Kasliwal 
designed for this kind of follow-up study. 


CURIOUSER AND CURIOUSER 

The evidence that the team gathered — mostly 
in the optical spectrum — suggested that an 
existing black hole is tearing a star apart, an 
observation they posted online (ref. 3) in August.
But to get the full picture, researchers needed 
to look at the spectrum of electromagnetic 
energy, from radio waves to γ-rays.

Just days after Smartt’s discovery, Anna Ho, 
another astronomer at Caltech, moved quickly 
to observe Cow in the radio spectrum. In a stellar
explosion, charged particles emit radio waves
as they spiral inside strong magnetic fields, and 
their wavelengths stretch as the material spreads 
out. This happens quickly, so astronomers are 
unlikely to catch events early enough to see 
short-wavelength emissions. But with Cow, Ho 
realized that she might have a rare chance to 
observe wavelengths of one millimetre or less. 
Early observations in June by her group and 
others indeed found emissions in the sub-mil- 
limetre range, so she submitted an emergency 
proposal for observing time to the Atacama 
Large Millimeter/submillimeter Array (ALMA)
in the Chilean Andes. 

For weeks, Ho’s team watched the spectrum 
of the event’s millimetre emissions as it evolved. 
The researchers found that matter was expand- 
ing outward as fast as one-tenth of the speed of 
light (ref. 2). But, unlike an ordinary supernova, this
short-wavelength radiation lasted for weeks, 
revealing the presence of a central ‘engine’ — a 
black hole or a spinning neutron star. “We were 
able to show that it’s not consistent with any of 
the usual mechanisms,” Ho says. 

Margutti and her colleagues, meanwhile, 
took advantage of a proposal Margutti had 
made to observe ‘transient’ events using NASA’s
NuSTAR X-ray telescope. Observations of Cow 
on NuSTAR and other telescopes confirmed the 
event was highly unusual: X-ray spectra showed 
that it was being reheated from the inside. This, 
too, points to a black hole or neutron star at the 
centre — although it’s too soon to conclude 
which. Margutti hopes that astronomers will 
observe more of these events and so begin to pin 
down the conditions that lead to one outcome 
over another. “The game begins now.”


1. Margutti, R. et al. Preprint at https://arxiv.org/abs/1810.10720 (2018).
2. Ho, A. Y. Q. et al. Preprint at https://arxiv.org/abs/1810.10880 (2018).
3. Perley, D. A. et al. Preprint at https://arxiv.org/abs/1808.00969 (2018).


BOTTOM-UP
BIOLOGY 


Researchers are tearing up the 
biology rule books by trying to 
construct cells from scratch. A 
special issue explores the lessons 
being learnt about life. 


Cells are often called the building blocks of
life — but that metaphor fails to capture
their complexity. How do the multitudes
of different molecules within a lipid envelope
come together to carry out the functions
required to sustain organisms? The standard
approach in biology has been to work from
the top down to study how cell components
interact in their natural environment. But technical
advances now allow researchers to take a
different tack: using engineering principles to
reconstruct biological processes from the bottom
up. This special issue explores the potential
and possible limits of bottom-up cell biology.

The ultimate goal for many is to construct
an artificial cell from scratch. But there are vigorous
debates about how to build it and what
functions would be required to constitute life.
A News Feature on page 172 explores how
researchers are working to develop components
such as membranes and metabolic pathways
and to piece them together into a whole. And
an Editorial on page 155 reminds us to consider
the responsibility that comes with creating artificial
life.

Before they succeed in creating an artificial
cell, researchers might be able to develop cell-like
systems engineered for biomedical applications.
On page 177, Dan Fletcher, a bioengineer
at the University of California, Berkeley, offers a
wish list for such ventures that address pressing
medical needs, including artificial blood cells
and smart delivery vehicles for drugs.

Some groups are extending the bottom-up
approach beyond cell construction. On page
203, Xavier Trepat at Spain’s Barcelona Institute
for Science and Technology and his colleagues
report that they have developed a system in
which cells stretch and deform themselves in
vitro in ways that have been seen only in metal
alloys. Now we can investigate whether this
property helps to shape tissues during development,
say Manuel Théry and Atef Asnacios from
Paris Diderot University in an accompanying
News & Views article, on page 192.

As with all scientific approaches, there are limits
to what can be learnt from engineering biology.
As a News & Views forum on page 188 highlights,
researchers disagree about how useful the
approach is for studying biological phenomena
that are governed by physical variables. The
complexity of cells is precisely what makes it
appealing to build one, piece by piece.


BIOLOGY
FROM SCRATCH 


BUILT FROM THE BOTTOM UP, SYNTHETIC CELLS 
COULD REVEAL THE BOUNDARIES OF LIFE. 


BY KENDALL POWELL

There were just eight
ingredients: two proteins,
three buffering
agents, two types of 
fat molecule and some 
chemical energy. But that 
was enough to create a flotilla 
of bouncing, pulsating blobs — rudimentary 
cell-like structures with some of the machinery 
necessary to divide on their own. 

To biophysicist Petra Schwille, the dancing 
creations in her lab represent an important 
step towards building a synthetic cell from the 
bottom up, something she has been working 
towards for the past ten years, most recently 
at the Max Planck Institute of Biochemistry in 
Martinsried, Germany. 

“I have always been fascinated by this 
question, ‘What distinguishes life from non-living
matter?’” she says. The challenge,
according to Schwille, is to determine which 
components are needed to make a living sys- 
tem. In her perfect synthetic cell, she'd know 
every single factor that makes it tick. 

Researchers have been trying to create arti- 
ficial cells for more than 20 years — piecing 
together biomolecules in just the right con- 
text to approximate different aspects of 
life. Although there are many such aspects, 
they generally fall into three categories: 
compartmentalization, or the separation 
of biomolecules in space; metabolism, the 
biochemistry that sustains life; and informa- 
tional control, the storage and management of 
cellular instructions. 

The pace of work has been accelerating, 
thanks in part to recent advances in micro- 
fluidic technologies, which allow scientists to 
coordinate the movements of minuscule cellu- 
lar components. Research groups have already 
determined ways of sculpting cell-like blobs 
into desired shapes; of creating rudimentary 
versions of cellular metabolism; and of trans- 
planting hand-crafted genomes into living 
cells. But bringing all these elements together 
remains a challenge. 


THE BUBBLE MACHINES

Researchers use microfluidic
chips to make lipid bubbles, or
liposomes, which are similar to
the envelopes that contain cells.
One approach features a six-way
junction that can fill liposomes
with solution and pinch them
off. With the fatty alcohol
1-octanol in the mix, a lipid
bilayer forms around the inner
solution. Over time, excess lipids
and 1-octanol pool at one end
and spontaneously split off,
leaving a fully formed liposome.

[Figure labels: inner solution; 1-octanol and dissolved lipids]


The field is, nevertheless, imbued with a new 
sense of optimism about the quest. In Septem- 
ber 2017, researchers from 17 laboratories in 
the Netherlands formed the group Building 
a Synthetic Cell (BaSyC), which aims to con- 
struct a “cell-like, growing and dividing sys- 
tem” within ten years, according to biophysicist 
Marileen Dogterom, who directs BaSyC and a
laboratory at Delft University of Technology. 
The project is powered by an €18.8-million 
(US$21.3-million) Dutch Gravitation grant. 

In September, the US National Science 
Foundation (NSF) announced its first pro- 
gramme on synthetic cells, funded to the tune 
of $10 million. And several European inves- 
tigators, including Schwille, have proposed 
building a synthetic cell as one of the European 
Commission's Future and Emerging Technolo- 
gies Flagship schemes, which receive funding 
of €1 billion. 

Bottom-up synthetic biologists predict that 
the first fully artificial cells could spark to life 
in little more than a decade. “I’m pretty sure 
we'll get there,” says Schwille.


ALL IN THE PACKAGING 

Research groups have made big strides recreat- 
ing several aspects of cell-like life, especially in 
mimicking the membranes that surround cells 
and compartmentalize internal components. 
That’s because organizing molecules is key to 
getting them to work together at the right time 
and place. Although you can open up a billion 
bacteria and pour the contents into a test tube, 
for example, the biological processes would not 
continue for long. Some components need to 
be kept apart, and others brought together. 

“To me, it’s about the sociology of 
molecules,” says Cees Dekker, a biophysicist
also at Delft University of Technology. 

For the most part, this means organizing 
biomolecules on or within lipid membranes. 
Schwille and her team are expert membrane-wranglers.
About a decade ago, the
team started adding Min proteins, which direct
a bacterial cell’s division machinery, to sheets
of artificial membrane made of lipids. The
Mins, the researchers found, would pop on 
and off the membranes and make them wave 
and swirl¹. But when they added the Mins to
3D spheres of lipids, the structures burst like 
soap bubbles, says Schwille. Her group and 
others have overcome this problem using 
microfluidic techniques to construct cell-sized 
membrane containers, or liposomes, that can 
tolerate multiple insertions of proteins — 
either into the membranes themselves or into 
the interior. 

Schwille’s graduate student, Thomas 
Litschel, and his collaborators dissolved the 
Min proteins in water and released droplets of 
the mixture into a rapidly spinning test tube. 
Centrifugal force pulls the droplets through 
layers of dense lipids that encapsulate them 
along the way. They come out at the other end 
as liposomes measuring 10–20 micrometres
across — about the size of an average plant or 
animal cell. These liposomes, known as giant 
unilamellar vesicles (GUVs), can be made in 
different ways, but in Litschel’s hands, the Min 
proteins caused the GUVs to pulsate, dance 
around and contract in the middle².

Schwille’s group wants to capitalize on its 
knowledge of these proteins, which can pro- 
duce membrane patterns and self-organize. 
“We understand these molecules really well,”
she says. “We'd like to see how far we can get 
with relatively simple elements like the Mins.” 
Perhaps, as Litschel’s work hints, the team 
could use the proteins to mould membranes 
for division or to gather components at one 
end of a synthetic cell. Just as some physicists 
might use duct tape and tinfoil to fine-tune 
their experiments, Schwille says she hopes that 
these handy biological molecules will give her 
the ability to tinker with cell-like structures: 
“I’m an experimentalist to the bone.”

Dekker’s team members have also filled
liposomes with their favourite proteins using a 
microfluidic chip (see ‘The bubble machines’).
On the chip, two channels containing lipid 
molecules converge on a water-filled chan- 
nel and spit out cell-sized liposomes that can 
hold various biological molecules, either stuck 
through the membrane or free-floating inside 
the container³.

His group has experimented with pressuriz- 
ing, deforming and reshaping the liposomes to 
take on non-spherical shapes that mimic cells 
better. Microfluidic devices give researchers 
more control to move, sort and manipulate 
liposomes using micro-channels that operate 
almost like circuits. This year, the Dekker lab 
designed a chip that could mechanically split 
a liposome in two by pushing it up against a 
sharp point⁴.

“This, of course, is not what we are 
after — we want to demonstrate division from 
the inside, but it still tells us interesting information,”
says Dekker. Examples include the
force it takes to divide a cell, and what types 
of physical manipulation the liposomes can 
tolerate. Along the same lines, his team has 
also played around with the shape of living 
Escherichia coli cells — making them wider or 
square by growing them in nanofabricated sili- 
cone chambers. In this way, team members can 
see how cell shape affects the division machin- 
ery, and assess how the Min proteins work in 
cells of different size and shape⁵.

“We play with nanofabrication techniques 
and do things a normal cell biologist would 
never do,” he says. “But a strange biophysicist
like me can do this.”


ADDING ENERGY TO THE SYSTEM 

Now that it’s possible to add components to 
the liposome bubbles without popping them, 
groups can plan how to make molecules work 
together. Almost anything life-like requires 
cellular energy, usually in the form of ATP. 
And although this can be added from the out- 
side to feed a synthetic system, many biolo- 
gists working on bottom-up approaches argue 
that a true synthetic cell should have its own 
power plant, something similar to an animal 
cell's mitochondrion or a plant’s chloroplast, 
both of which make ATP. 

Joachim Spatz’s group at the Max Planck 
Institute for Medical Research in Heidelberg, 
Germany, has built a rudimentary mitochon- 
drion that can create ATP inside a vesicle. 

To do this, his team took advantage of new 
microfluidic techniques. First, they stabilized 
GUVs by placing them inside water-in-oil 
droplets surrounded by a viscous shell of 
polymers. Then, as these droplet-stabilized 
GUVs flowed down a microchannel, the team 
injected big proteins into them, either inside 
the vesicle or embedded in the membrane’s 
surface (see ‘The assembly lines’).


They loaded these membranes with an 
enzyme called ATP synthase, which acts as a 
kind of molecular waterwheel, creating ATP 
energy from precursor molecules as protons 
flow through the membrane. By adding acid 
to boost protons outside the GUVs, the team 
drove ATP’s production on the inside⁶.

Spatz explains that researchers could cycle 
the GUVs around the microchannel again for 
another protein injection, to sequentially add 
components. For instance, the next step could 
be to add a component that will automatically 
set up the proton gradient for the system. 

“That's an important module, like you have
in real life,” says Spatz.

Another Max Planck synthetic-biology 
group led by biochemist Tobias Erb has been 
chipping away at other approaches to con- 
structing cellular metabolic pathways. He’s 
particularly interested in pathways that allow 
photosynthetic microbes to pull carbon diox- 
ide from the environment and make sugars 
and other cellular building blocks. 

Erb, a group leader at the Max Planck 
Institute for Terrestrial Microbiology in Mar- 
burg, Germany, takes a blank-slate approach 
to synthesizing cellular metabolic pathways. 
“From an engineering point of view, we think 
about how to design,” he says, “and then we 
build it in the lab.”

His group sketched out a system design that 
could convert CO₂ into malate, a key metabolite
produced during photosynthesis. The team
predicted that the pathway would be even 
more efficient than photosynthesis. Next, Erb 
and his team searched databases for enzymes 
that might perform each of the reactions. For 
a few, they needed to tweak existing enzymes
into designer ones.

In the end, they found 17 enzymes from
9 different organisms, including E. coli, an
archaeon, the plant Arabidopsis and humans.
The reaction, perhaps unsurprisingly, was
inefficient and slow⁷.

“We put a team of enzymes together that did
not play well together,” says Erb. After some
further enzyme engineering, however, the
team has a “version 5.4” that Erb says operates
20% more efficiently than photosynthesis.

Expanding this work, Erb’s group has
begun constructing a crude version of a synthetic
chloroplast. By grinding up spinach
in a blender, and adding its photosynthesis
machinery to their enzyme system in the
test tube, the biologists can drive the production
of ATP and the conversion of CO₂ to
malate — solely by shining ultraviolet light
on it.

Although everything can work for a brief
time in a test tube, says Erb, “at the end, we
would like it compartmentalized, like a chloroplast”.
He's excited to collaborate with synthetic
biologists such as Kate Adamala, who can build
and control complex compartments.

Adamala’s group at the University of
Minnesota in Minneapolis is working on
ways to build programmable bioreactors,
by introducing simple genetic circuits into
liposomes and fusing them together to create
more-complex bioreactors. She calls them
“soap bubbles that make proteins”.

Her group builds these bioreactors using
a spinning tube system similar to Schwille’s,
but which produces smaller liposomes. The
researchers add circles of DNA called plasmids
that they have designed to perform a particular
function, along with all the machinery needed
to make proteins from DNA.


THE ASSEMBLY LINES

A pico-injection system allows researchers to load cell-membrane-like compartments called
liposomes with functional proteins. Liposomes are stabilized by a polymer coating and pushed
through a microfluidic channel. As they pass over a pico-injection site, an electrical pulse can trigger
the incorporation of internal proteins or membrane-bound proteins (as shown) into the liposomes.

[Figure labels: ground electrode; polymer shell; lipid bilayer]

For instance, her group has made liposome 
bioreactors that can sense an antibiotic in 
their environment through membrane pores 
and can generate a bioluminescent signal in 
response⁸.

By fusing simple bioreactors together 
sequentially, the team can construct more- 
complex genetic circuits. But the sys- 
tems start to break down as they expand 
to include ten or so components. This is a 
major challenge for the field, Adamala says. 
In a real cell, proteins that might interfere 
with each other’s actions are kept apart by 
a variety of mechanisms. For much simpler 
synthetic cells, biologists must find other 
ways to impose that control. This could be 
through external gatekeeping, in which the 
experimenter decides which liposomes get 
mixed together and when. It might also be 
accomplished through chemical tags that 
regulate which liposomes can fuse together, 
or through a time-release system. 


INFORMATIONAL INJECTIONS 

Another key to making a cell is getting the 
software right. Enabling a synthetic cell to 
follow scientists’ instructions and to replicate 
itself will require some way of storing and 
retrieving information. For living systems, this 
is done by genes — from hundreds for some 
microbes, to tens of thousands for humans. 

How many genes a synthetic cell will need 
to run itself is a matter of healthy debate. 
Schwille and others would like to keep it in the 
neighbourhood of a few dozen. Others, such 
as Adamala, think that synthetic cells need 
200–300 genes.

Some have chosen to start with something 
living. Synthetic biologist John Glass and 
his colleagues at the J. Craig Venter Institute 
(JCVI) in La Jolla, California, took one of the
smallest-known microbial genomes on the 
planet, that of the bacterium Mycoplasma 
mycoides, and systematically disrupted its
genes to identify the essential ones. Once they
had that information, they chemically stitched 
together a minimal genome in the laboratory. 

This synthesized genome contained 
473 genes — about half of what was in the 
original organism — and it was transplanted 
into a related bacterial species, Mycoplasma 
capricolum⁹. In 2016, the team showed that this
minimal synthetic genome could ‘boot up’ a
free-living, although slow-growing, organism¹⁰.
Glass thinks that it will be hard to decrease that 
number much more: take any gene away, and 
it either kills the cells or slows their growth to 
near zero, he says. 

He and his JCVI colleagues are compiling a 
list of ‘cellular tasks’ based on the latest version 
of their creation, JCVI-syn3.0a, which could 
act as a blueprint of a cell’s minimal to-do 
list. But for about 100 of these genes, they 
can't identify what they do that makes them 
essential. 

As a next step, and supported by an NSF 
grant of nearly $1 million, Glass and Adamala 
will attempt to install the JCVI-syn3.0a 
genome into a synthetic liposome containing 
the machinery needed to convert DNA into 
protein, to see whether it can survive. In that 
case, both the software and the hardware of the 
cell would be synthetic from the start. 

If it could grow and divide, that would be a 
tremendous step. But many argue that to truly 
represent a living system, it would also have to 
evolve and adapt to its environment. This is 
the goal with the most unpredictable results 
and also the biggest challenges, says Schwille. 
“A thing that just makes itself all the time is not 
life — although I would be happy with that!” 
she says. “For a cell to be living, it needs to 
develop new functionality.”

Glass’s team at the JCVI has been doing 
adaptive laboratory evolution experiments 
with JCVI-syn3.0a, selecting for organisms 
that grow faster in a nutrient-rich broth. So 
far, after about 400 divisions, he and his team 
have obtained cells that grow about 15% faster 
than the original organism. And they have
seen a handful of gene-sequence changes
popping up. But there’s no evidence yet of the 
microbe developing new cellular functions or 
increasing its fitness by leaps and bounds. 

Erb says that working out how to add 
evolution to synthetic cells is the only way to 
make them interesting. That little bit of messi- 
ness in biological systems is what allows them 
to improve their performance. “As engineers, 
we can't build a perfect synthetic cell. We have 
to build a self-correcting system that becomes 
better as it goes,” he says. 

Synthetic cells could lead to insights about 
how life might look on other planets. And 
synthetic bioreactors under a researcher's 
complete control might offer new solutions 
to treating cancer, tackling antibiotic resist- 
ance or cleaning up toxic sites. Releasing 
such an organism into the human body or 
the environment would be risky, but a top- 
down engineered organism with unknown 
and unpredictable behaviours might be even 
riskier. 

Dogterom says that synthetic living cells 
also bring other philosophical and ethical 
questions: “Will this be a life? Will it be auton- 
omous? Will we control it?” These conversa- 
tions should take place between scientists and 
the public, she says. As for concerns that syn- 
thetic cells will run amok, Dogterom is less 
worried. “I'm convinced our first synthetic 
cell will be a lousy mimic of what already 
exists.” And as the engineers of synthetic life, 
she and her colleagues can easily incorporate 
controls or a kill switch that renders the cells 
harmless. 

She and other synthetic biologists will keep 
pushing ahead exploring the frontiers of life. 
“The timing is right,” says Dogterom. “We 
have the genomes, the parts list. The minimal 
cell needs only a few hundred genes to have 
something that looks sort of alive. Hundreds 
of parts is a tremendous challenge, but it’s not 
thousands — that’s very exciting.”


Kendall Powell is a freelance science 
journalist in Lafayette, Colorado.


Children receiving blood transfusions in Bangladesh, where maintaining the supply from donors can be more challenging than in wealthy countries.


Which biological systems 
should be engineered? 


To solve real-world problems using emerging abilities in synthetic biology, 
research must focus on a few ambitious goals, argues Dan Fletcher. 


The difference between tweaking and
engineering is subtle but important.
Scientists have been tweaking
cells at the molecular scale for decades.
In 1974, two researchers loaded DNA
from a frog into a bacterium, prompting
the microbe to produce a foreign RNA¹.
Twenty years later, scientists used a fluorescent
protein from jellyfish to track
gene expression in nematode worms, and
to tag selected molecules in fruit flies²,³.
The fluorescent components lit up under
a microscope — kicking off a new era of
watching cell biology in action.

Now, biologists at the Allen Institute for 
Cell Science in Seattle, Washington, are 
tweaking the DNA of human stem cells 
to probe cell organization and function 
by replacing natural proteins with their
fluorescent counterparts (27 so far; see
go.nature.com/2afaka5). Even physicians 
are getting in on the act, tweaking patients’ 
immune cells to improve the treatment of 
cancers, often with remarkable success⁴.
In my view, engineering is something 
different. The ultimate goal of engineering 
is to construct systems that solve problems, 
such as a synthetic pancreas for people with 
diabetes. The systems must be planned in 
mechanistic detail — to achieve the desired 
function, and to minimize the risk of
failure or unintended consequences.
That means building systems from the 
bottom up, with precise knowledge of all the 
component parts. In other words, engineer- 
ing begins when design enters the picture. 

Tweaking — fine-tuning a system 
through small changes — will continue to 
be an essential part of biological discovery 
and the development of new therapies. 
Engineering, by contrast, usually requires 
big teams, big budgets and narrow goals to 
achieve ambitious objectives through 
design. It also tends to lay bare how 
much (or how little) we know about 
controlling nature. 

Never has it been more possible 
to engineer biology (see ‘Tailor, not
tinker’). But solving grand problems
requires a switch from demonstrating 
that something is feasible in a labora- 
tory to homing in on a few ambitious 
goals. The time has come to decide 
where to focus this emerging ability 
to engineer biology — and to commit 
resources to doing it. 


AN ENGINEER’S WISH LIST 
So what should those goals be? 

In this discussion, I leave aside 
multicellular engineering projects, 
such as artificial tissues and organs, 
simply because it makes sense to 
start with something simpler. I have 
narrowed the scope of the projects I 
propose to those that could feasibly be 
achieved in the next decade with the 
right coordination, collaboration and 
support. And I focus on problems in 
human health, because this is an area I’ve 
thought most about. (Engineering plants to 
produce crops that are high yield, drought- 
and pest-resistant and environmentally 
friendly, including plant-based ‘meat’,
deserves its own separate discussion.) 

My ‘wish list’ is as follows: 

Artificial blood cells. Blood trans- 
fusions are crucial in treatments for 
everything from transplant surgery and 
cardiovascular procedures to car accidents, 
pregnancy-related complications and child- 
hood malaria (see go.nature.com/2ozbfwt). 
In the United States alone, 36,000 units of 
red blood cells and 7,000 units of platelets
are needed every day (see go.nature.com/2ycr2wo).

But maintaining an adequate supply of 
blood from voluntary donors can be chal- 
lenging, especially in low- and middle- 
income countries. To complicate matters, 
blood from donors must be checked exten- 
sively to prevent the spread of infectious 
diseases, and can be kept for only a limited
time: 42 days for red blood cells, or 5 days for platelets alone.
What if blood cells could be assembled 
from purified or synthesized components 
on demand? 

In principle, cell-like compartments
could be made that have the oxygen-carrying
capacity of red blood cells or
the clotting ability of platelets. The compartments
would need to be built with
surface molecules, resembling those on a
normal blood cell, to protect them from the
immune system.
Other surface molecules would be needed 
to detect signals and trigger a response. 

Human blood as viewed under a scanning electron microscope.


In the case of artificial platelets, that
signal might be the protein collagen, to
which circulating platelets are exposed
when a blood vessel ruptures⁵. Such compartments
would also need to be able to
release certain molecules, such as factor V 
or the von Willebrand clotting factor. This 
could happen by building in a rudimentary 
form of exocytosis, for example, whereby a 
membrane-bound sac containing the mol- 
ecule would be released by fusing with the 
compartment’s outer membrane. 

It is already possible to encapsulate 
cytoplasmic components from living 
cells in membrane compartments⁶,⁷. Now
a major challenge is developing ways to 
insert desired protein receptors into the 
lipid membrane⁸, along with reconstituting
receptor signalling.

Red blood cells and platelets are good 
candidates for the first functionally useful 
synthetic cellular system because they lack 
nuclei. Complex functions such as nuclear 
transport, protein synthesis and protein 
trafficking wouldn't have to be replicated. If
this succeeds, we might look back with horror
on the current practice of bleeding one 
person to treat another. 

Designer immune cells. Immuno- 
therapy is currently offering new hope 
for people with cancer by shaping how
the immune system responds to tumours.
Cancer cells often turn off the immune 
response that would otherwise destroy 
them. The use of therapeutic antibodies to 
stop this process has drastically increased 
survival rates for people with multiple 
cancers, including those of the skin, blood 
and lung⁹. Similarly successful is the technique
of adoptive T-cell transfer. In this,
a patient’s T cells or those of a donor are 
engineered to express a receptor that targets 
a protein (antigen) on the surface of 
tumour cells, resulting in the T cells 
killing the cancerous cells (these are called
CAR-T therapies)¹⁰. All of this has
opened the door to cleverly rewiring 
the downstream signalling that results 
in the destruction of tumour cells by 
white blood cells¹¹.

What if researchers went a step 
further and tried to create synthetic 
cells capable of moving towards, bind- 
ing to and eliminating tumour cells? 

In principle, untethered from 
evolutionary pressures, such cells 
could be designed to accomplish all 
sorts of tasks — from killing specific 
tumour cells and pathogens to remov- 
ing brain amyloid plaques or choles- 
terol deposits. If mass production of 
artificial immune cells were possible, 
it might even lessen the need to tailor 
treatments to individuals — cutting 
costs and increasing accessibility. 

To ensure that healthy cells are not 
targeted for destruction, engineers 
would also need to design complex 
signal-processing systems and safe- 
guards. The designer immune cells would 
need to be capable of detecting and mov- 
ing towards a chemical signal or tumour. 
(Reconstituting the complex process of 
cell motility is itself a major challenge, 
from the delivery of energy-generating 
ATP molecules to the assembly of actin 
and myosin motors that enable movement.) 

Researchers have already made cell-like 
compartments that can change shape¹², and
have installed signalling circuits within
them¹³. These could eventually be used to
control movement and mediate responses 
to external signals. 

Smart delivery vehicles. The relative 
ease of exposing cells in the lab to drugs, 
as well as introducing new proteins and 
engineering genomes, belies how hard it 
is to deliver molecules to specific locations 
inside living organisms. One of the big- 
gest challenges in most therapies is getting 
molecules to the right place in the right cell 
at the right time. 

Harnessing the natural proclivity of 
viruses to deliver DNA and RNA molecules 
into cells has been successful¹⁴. But virus
size limits cargo size, and viruses don’t nec- 
essarily infect the cell types researchers and 
clinicians are aiming at. Antibody-targeted
synthetic vesicles have improved the
delivery of drugs to some tumours. But get- 
ting the drug close to the tumour generally 
depends on the vesicles leaking from the 
patient's circulatory system, so results have 
been mixed. 

Could ‘smart’ delivery vehicles contain- 
ing therapeutic cargo be designed to sense 
where they are in the body and move the 
cargo to where it needs to go, such as across 
the blood-brain barrier? 

This has long been a dream of those in 
drug delivery. The challenges are similar 
to those of constructing artificial blood 
and immune cells: encapsulating defined 
components in a membrane, incorporat- 
ing receptors into that membrane, and 
designing signal-processing systems to 
control movement and trigger release of 
the vehicle’s contents. 

The development of immune-cell 
‘backpacks’ is an exciting step in the 
right direction. In this, particles contain- 
ing therapeutic molecules are tethered to 
immune cells, exploiting the motility and 
targeting ability of the cells to carry the 
molecules to particular locations.

A minimal chassis for expression. In 
each of the previous examples, the engi- 
neered cell-like system could conceivably 
be built to function over hours or days, 
without the need for additional protein 
production and regulation through gene 
expression. For many other tasks, how- 
ever, such as the continuous production of 
insulin in the body, it will be crucial to have 
the ability to express proteins, upregulate or 
downregulate certain genes, and carry out 
functions for longer periods. 

Engineering a ‘minimal chassis’ that is 
capable of sustained gene expression and 
functional homeostasis would be an inval- 
uable starting point for building synthetic 
cells that produce proteins, form tissues 
and remain viable for months to years. This 
would require detailed understanding and 
incorporation of metabolic pathways, traf- 
ficking systems and nuclear import and 
export — an admittedly tall order. 

It is already possible to synthesize DNA 
in the lab, whether through chemically 
reacting bases or using biological enzymes 
or large-scale assembly in a cell. But we
do not yet know how to ‘boot up’ DNA and 
turn a synthetic genome into a functional 
system in the absence of a live cell. 

Since the early 2000s, biologists have 
achieved gene expression in synthetic 
compartments loaded with cytoplasmic 
extract. And genetic circuits of increasing
complexity (in which the expression of one 
protein results in the production or degra- 
dation of another) are now the subject of 
extensive research. Still to be accomplished 
are: long-lived gene expression, basic pro- 
tein trafficking and energy production 
reminiscent of live cells. 


TAILOR, NOT TINKER

Tools for engineering biological systems are in place

Researchers have established the
separation and characterization methods
needed to identify almost all the parts of
a single cell. They’ve also made strides
in designing some desired functions and
putting parts together in new ways.
Thanks to the work of synthetic
biologists in the early 2000s, gene circuits
can be designed that use AND, OR and
NAND logic gates (elementary signalling
circuits). They can also be designed
to produce proteins that sense and kill
tumours, including photoreceptive
elements that act like pixels in a camera
and capture photographs.

Using genome-editing tools, yeast can
be modified to produce biofuels, opiates or
plant-free hop flavouring for beer. Even
complete makeovers are possible: in 2016,
researchers simplified the entire genome
of the bacterium Mycoplasma mycoides,
and incorporated this ‘minimal genome’
into cells that proved viable and were able
to grow. D.F.


RISK AND REWARD

In ten years’ time, this wish list could seem
either ridiculously myopic or foolishly
ambitious. That is what makes this era of
engineering biology so exciting. Whether
or not these goals are reached, the attempt
to build systems from known parts will
focus our attention on the significant gaps
in our understanding of how such systems
work.

Already, many of these ideas are being
explored by researchers from diverse fields.
They are often considered too risky to be
embraced by conventional funding
sources, and are thus relegated to a
side project.

But risky ideas only get the chance
to become real through focused attention
and effort, and that means giving them
enough time and money. Some moves to
provide this are happening. The Max Planck
Research Network in Synthetic Biology, a
German collaboration, is funding efforts
to identify the minimal building blocks
of living systems. And in September, the
US National Science Foundation launched
a project to foster the engineering of
synthetic cells under its Understanding the
Rules of Life programme.

More support is needed, specifically
from organizations and foundations with
longer time horizons than those typical
of industry or federal-grant providers.
With sustainable funds and a willingness
to embrace or at least accept the role of
engineering biology in addressing societal
challenges, we could build a world in which
we trust artificial cells engineered to detect
and treat the early signs of Alzheimer’s
disease as much as we trust aeroplanes to
land safely.

To be clear, there is nothing wrong with
tweaking biology. My lab will continue to
tweak, fiddle, futz and tinker as we pursue
a deeper understanding of how cells organize
their membranes and cytoskeletons. But
the time has come to focus, organize and
set clear goals to solve big problems. The
necessary tools are ready and the issues
are pressing. Physicist Richard Feynman
famously said: “What I cannot create, I do
not understand.” For this era of designing
biological systems, his quote should
have a corollary: “What I cannot engineer,
I should not use.”


Dan Fletcher is a professor of 
bioengineering and biophysics, and chair 
of the Department of Bioengineering at the 
University of California, Berkeley, USA. 
He is also a Chan Zuckerberg Biohub 
Investigator. 

e-mail: fletch@berkeley.edu 
Workers prepare to bury the body of someone who died of Ebola in Sierra Leone in 2014.


PUBLIC HEALTH 


Lessons from the 
Ebola front lines 


Nahid Bhadelia hails an analysis of the fraught 
campaign to contain the 2013-16 outbreak crisis. 


When you work inside a system
that you believe in, yet recognize
as dysfunctional, it can be hard
to offer a balanced critique. And the stakes
are high when the system in question is the 
response to a humanitarian emergency, 
where public opinion determines crucial 
funding. Outbreak Culture, on the Ebola 
crisis of 2013-16, strikes this delicate balance 
expertly. 

The central theme of the book, by geneticist 
Pardis Sabeti and journalist Lara Salahi, is that 
common threads of dysfunction run through 
responses to epidemics. These emerge from 
the underlying architecture of societies 
(resulting from history, politics and culture), 
the competition between international organ- 
izations and (sometimes) the self-serving 
motivations of individuals in high-stress 
situations. We fracture along lines of existing 
weaknesses. 

Sabeti — who was part of the Ebola 
response effort — and Salahi show that 
the world is doomed to repeat the same 
mistakes with every new epidemic unless
outbreak response shifts to a “mode that
favors collaboration instead of competition,
and readiness instead of reaction”.
My experience leads me to concur. I was a
clinical responder in Sierra Leone during
the Ebola epidemic in both 2014 and 2015;
I now work in Uganda, across the border
from the current outbreak in the Democratic
Republic of the Congo. (Full disclosure:
I appear briefly in the book.)

Outbreak Culture: The Ebola Crisis and the Next Epidemic
PARDIS SABETI AND LARA SALAHI
Harvard University Press (2018)

Outbreak Culture is based on the cumula- 
tive accounts of more than 200 people who 
worked on the West African response, from 
clinicians and researchers to public-health 
workers and administrators. By bringing in 
these voices, the work distinguishes itself 
from recent individual accounts. Yet it maintains
a personal tone. It also incorporates
the post-epidemic analyses by international 
organizations and governments, such as the 
internal and external World Health Organi- 
zation reviews, and independent reports by 
the US National Academy of Medicine and 
other academic collaboratives. And it puts a 
human face on the challenges by telling the 
stories of researchers, clinicians and patients, 
including Sabeti and members of her lab. 

Today, every new outbreak demands the 
re-creation of administrative structures. 
Negotiations for access and operations are 
made in real time. Sabeti and Salahi argue 
that the dearth of dependable administra- 
tive structures and oversight in West Africa 
created a space in which competing incen- 
tives and personal ambitions wreaked 
havoc. They highlight the dark underbelly of 
on-the-ground interactions that slow down 
and ultimately stymie the response itself, and 
which many reports have previously shied 
away from. 

The authors interviewed numerous 
people and sent out an e-mail survey; 
132 people answered. This was anonymous, 
because otherwise most of us (I took the 
survey) would have baulked at answering 
the hard questions it featured about compe- 
tition, corruption and waste. About 27% of 
respondents said that they had experienced 
illegal tactics during the epidemic. Another 
37% noted that they had experienced intimi- 
dation, for instance by personally facing, 
witnessing, hearing about or perpetuating it. 

Sabeti and Salahi share stories of how 
data and samples that could have helped 
were hoarded by researchers and organiza- 
tions, including individuals within several 
international public-health bodies. They 
give an unflinching account of how corrup- 
tion at all levels siphoned resources away 
from communities in need. For example, 
the International Federation of Red Cross 
and Red Crescent Societies confirmed that 
almost US$6 million of its Ebola funds 
were misappropriated in West Africa. The 
authors also highlight concrete examples 
of how organizational politics, cultural 
differences and mistrust led to ineffective 
communication at crucial points during the 
epidemic. 

Despite these revelations, the book ends 
on a positive note. It links failures during 
the epidemic to specific underlying gaps in 
governance (for example, the ways in which 
aid is distributed and reported allowed for 
individuals to siphon off funds). It identi- 
fies core processes required for reform. 
These include streamlining communica- 
tion systems between outbreak-response 
partners, and establishing a central research- 
governance structure composed of experts 
without conflicts of interest. 

It would have been interesting to know 
what, if any, changes the organizations 
named have made in the aftermath of the
epidemic. And the authors could have 
invited the individuals singled out for bad 
behaviour to give their perspective, even if 
confidentially. 

Indeed, my favourite part of the book was 
the authors’ description of Sierra Leone’s 
Fambul Tok (Krio for ‘family talk’), a ver- 
sion of truth and reconciliation committees 
that operate at the village level. They pro- 
pose these as a best-practice example of how 
international organizations can elicit real- 
time feedback from outbreak responders. 

The power of Outbreak Culture is its uni- 
versality. It describes dynamics common at 
varying levels in every humanitarian emer- 
gency. But if this book is in any way about 
one person, it is about Sheik Humarr Khan. 
His story is woven through the authors’ 
arguments. Khan was born and grew up in 
Sierra Leone, and against immense odds
he became a lead[…] was one of the
first physicians to care for people
with Ebola in the
country in the first half of the epidemic. In
July 2014, he died of the disease. The book 
begins with the political and ethical issues 
surrounding his treatment. Where should 
he have been treated? Should he have been 
given experimental therapeutics? What 
about the failure to evacuate him (or delay 
in doing so) for more advanced supportive 
care, even as international responders from 
the West were offered this opportunity? 

My first deployment as physician in an 
Ebola treatment unit during the epidemic 
was in Kenema, Sierra Leone. I arrived 
about three weeks after Khan died; the 
health-care community there was still reel- 
ing from the loss. His story underscores one 
of the basic truths of the book: where you 
are born decides whether or not you sur- 
vive Ebola. From the comfort of a Western 
lab or clinic, it is easy to forget the differ- 
ences in quality of treatment (and a million 
other small injustices) for people in poorer 
communities. It is much harder to ignore 
those differences when you are working in 
the field, shoulder to shoulder with health- 
care providers who know that they will not 
be airlifted to better care if they get sick, yet 
have the unimaginable courage to carry on. 
Outbreak Culture is a much-needed call for 
greater justice next time.


Nahid Bhadelia is an infectious-diseases 
physician and medical director of the 
Special Pathogens Unit at Boston Medical 
Center and the National Emerging 
Infectious Diseases Laboratories in Boston, 
Massachusetts. 

e-mail: nbhadeli@bu.edu 


Books in brief 


American Overdose: The Opioid Tragedy in Three Acts 

Chris McGreal PUBLICAFFAIRS (2018) 

Since 1999, opioids have killed an estimated 350,000 people in the 
United States. In this powerful encapsulation of that epidemic, which 
grew nearly unchecked for two decades, journalist Chris McGreal 
argues that the culprits are three: a profit-driven medical system, 
pharmaceutical-industry greed and bad science. He traces the 
unfolding crisis through the emergence of drugs such as oxycodone; 
‘pill mill’ medical clinics that became addiction hotspots; and 
uneven regulation. As opioid use rises in Britain, parts of Africa and 
Australia, this is a timely examination of hard-won lessons. 


The Library of Ice
Nancy Campbell SCRIBNER (2018)

In winter 2010, poet and writer Nancy Campbell journeyed to the
night-shrouded Arctic, where a residence at a Greenland museum
launched this kaleidoscopic exploration of ice in science and culture.
An intellectual omnivore, Campbell examines ice cores as archives in
which researchers “read the alphabet of elements and isotopes”; and
probes the weird dynamics of hail, proto-chemist Robert Boyle’s 1665
New Experiments and Observations Touching Cold, curling rinks and the
exploits of polar explorers from Knud Rasmussen to George Murray
Levick. A marvellously subtle journey by way of flake, frost and berg.


18 Miles: The Epic Drama of Our Atmosphere and Its Weather
Christopher Dewdney ECW (2018) 

Our atmosphere may be just a few tens of kilometres deep, but in its 
vast amphitheatre, weather — the greatest show off Earth — struts 
its stuff. Poet and naturalist Christopher Dewdney’s grand tour 
mingles meteorology, planetary science and literature, taking us 
from oxygenation 2.5 billion years ago through atmospheric layers, 
precipitation, the architecture of wind and the work of scientists 
from Robert FitzRoy to Milutin Milankovic. Among many remarkable 
stories is that of US Marine William Rankin’s 1959 parachute drop 
into the wild, convulsed depths of an active cumulonimbus cloud. 


Searching for the Lost Tombs of Egypt 

Chris Naunton THAMES & HUDSON (2018) 

For all the astounding finds in Egypt over the past two centuries, its 
storied landscape is still riddled with ‘known unknowns’ — the lost or 
undiscovered tombs of ancient luminaries, tantalizingly mentioned in 
historical accounts. In this gorgeously illustrated study, Egyptologist 
Chris Naunton builds a case for the probable burial sites of several 
figures, from the brilliant third-millennium BC architect Imhotep to 
fourth-century BC empire builder Alexander the Great. Perhaps the 
most fascinating case is that of Cleopatra, ruler of Egypt from 51 to 
30 BC, whose vast mausoleum may rest on the sea bed off Alexandria.


Darwin’s Most Wonderful Plants 

Ken Thompson PROFILE (2018) 

In this quietly riveting study, plant biologist Ken Thompson 
reveals Charles Darwin as a botanical revolutionary through 
works such as On the Movement and Habits of Climbing Plants 
(1865), which remains pertinent. Interweaving current research 
with Darwin’s insights, Thompson probes marvels such as “ivy 
glue”, a nanocomposite that functions not unlike a gecko’s bristles 
in sticking stems to surfaces; the astonishing mimicry of the 
chameleon vine Boquila trifoliolata; and the Cook pine Araucaria 
columnaris, which always leans towards the equator. Barbara Kiser
Hoverflies’ large compound eyes give them a very large field of view.


An eye on ocular wonders 


Todd Oakley enjoys a meander through vision research in myriad organisms.


The many eyes of scallops use
reflecting mirrors, something like
telescopes. Mantis shrimp have
12 classes of colour receptor to our 3. These
findings in vision science are just two of
many milestones in neurobiologist Michael
Land’s meander through topics in the field,
from the optical physics of eye function
to the neurobiology of how brains interpret
optical phenomena. Eyes to See is the
journey of a scientist who followed his nose
(and eyes) towards what fascinated him.

The book is organized loosely around
Land’s career. Opening with his 1960s
discovery of scallop eyes’ bizarre reflecting
mechanism, Land quickly transitions to the
evolution of early eyes. Here, he lucidly
encapsulates zoologist Dan-Eric Nilsson’s
functional synthesis, which links demands
such as efficiency of light capture to optical
innovations such as elaborations of cell
membranes. These innovations underlie
four main stages of eye evolution: non- 
directional, directional, low-resolution 
and high-resolution vision. 

Land dives into history, such as the 
contributions of late-nineteenth-century 
scientist Sigmund Exner, along with fun 
anecdotes and clearly described optical 
principles. He relates how he and a col- 
league shot peppercorns, pieces of potato 
and tethered peas towards male hoverflies, 
which mistook the objects for females, 
while another colleague lay on his back 
to film the flies’ pursuit. Land describes 
the optics of arthropods’ compound eyes, 
which include apposition and superposi- 
tion eyes, both reflecting (co-discovered 
by Land) and refracting. Superposition eyes 
combine light from multiple facets onto a 
single receptor and are found in many 
nocturnal insects. 


UNDERWATER WONDERS 
To study the eyes of deep-sea animals, Land 
spent time on research vessels. He writes of 
camouflage. Silver-sided fishes, for instance, 
reflect the homogeneous horizontal scenes 
of the open sea to blend in. Bioluminescence
is often deployed to match downwelling light 
and avoid casting a shadow. Here in the deep, 
we also meet some other amazing creatures, 
including the hyperiid amphipods and the 
spookfish Dolichopteryx longipes, which 
have peculiar double eyes that look straight 
up and down to take advantage of the envi- 
ronment’s vertically oriented light. 

Even the most sophisticated eyes are of no 
use without a brain with which to interpret 
and use the information. In examining neu- 
robiology, Land first looks at how animals 
recognize things. He found, through experi- 
mentation with cotton reels and remote- 
controlled toy cars, that fiddler crabs use 
simple rules to detect potential mates or 
predators — anything seen to extend above 
the horizon is bigger than a crab and there- 
fore a foe. Thus, crabs are easily fooled, 
even by simple shapes shown to them by 
experimenters. Land also recounts how, 
in 1968, he developed an ophthalmoscope 
for tiny jumping spiders such as Phidippus
johnsoni, to track their eye movements
in an era before video. With this, he
discovered something like a program of eye
movements that the spiders use to detect
potential mates, for example by looking for
legs. The spider ophthalmoscope is one of
the clearest accounts in the book of Land’s
scientific ingenuity.


Eyes to See: The Astonishing Variety of Vision in Nature
MICHAEL F. LAND
Oxford University Press (2018)

The eyes of scallops contain mirrored structures, a little like telescopes.


Leaving invertebrates behind, Land 
moves on to humans. As he writes, we can 
communicate about what we see, mak- 
ing some aspects of vision — especially 
perception — easier to study in humans 
than in (say) a mantis shrimp. He has 
used eye-tracking tools to learn where 
people look when driving around a curve, 
reading words or music, or striking a 
cricket ball. This research expands into 
the realms of psychology — investigating 
mental images of colour and depth — and 
beyond, to philosophical territory such as 
consciousness. 


BEHIND THE LENS 

I had a few criticisms. First, instances of 
outdated scholarship pepper the book. 
One example is Land’s discussion of 
graded-index lenses in aquatic animals, 
which achieve high power without spheri- 
cal aberration, using a smooth transition 
from low to high refractive power. Land 
suggests that how the proteins of a graded 
lens are arranged remains a mystery. This 
omits high-profile work by biophysicist 
Alison Sweeney and her colleagues on 
squid lenses, which shows how proteins — 
duplicated and differently sized — pack to 
different densities to create a smooth transition
in refractive index (A. M. Sweeney et al.
J. R. Soc. Interface 22, 685-698; 2007;
J. Cai et al. Science 357, 564-569; 2017).

Land’s organization of ideas often seems 
forced, as if he has tried to shoehorn 
research driven mainly by curiosity into 
synthetic constructs. For example, I found 
his adaptationist summary unsatisfying. He 
uses geneticist Theodosius Dobzhansky’s 
ubiquitous quote — “nothing in biology 
makes sense except in the light of evolu- 
tion” — to contend that evolution makes 
progress when needed. But evolution has 
not been the main thrust of Land’s research, 
and a view of it as an optimizing force does 
little to inform us about how a specific eye 
works or how a specific brain decodes visual 
information. 

Yet the positives of Eyes to See outweigh 
its negatives. It is enjoyable to get fre- 
quent glimpses of Land’s career, which has 
defined many of the most actively studied 
topics and organisms in visual ecology 
today. Ultimately, there is a coherence to 
the book’s seemingly stochastic journey 
through the workings of vision, as we 
proceed from photon to philosophy.


Todd Oakley is a professor of 
evolutionary biology at the University of 
California, Santa Barbara. 

e-mail: oakley@ucsb.edu 
Nobel boost for
basic research 


At a time when curiosity-driven
research struggles for respect, this 
year’s Nobel Prize in Physiology 
or Medicine provides a visceral 
example of its value. The prize 
was won by James Allison 

and Tasuku Honjo for their 
discoveries of the potentially 
immunotherapeutic molecules 
CTLA-4 and PD-1, respectively 
(see Nature 562, 20-21; 2018). 

In 1990, when Jim suggested 
that we work on the role of a 
newly identified member of the 
immunoglobulin superfamily, 
called CTLA-4, we agreed
simply because it was similar in 
sequence to another molecule, 
CD28, on the surface of mouse 
immune cells (T cells). We 
were investigating things about 
which we knew very little, with 
no inkling that the work would 
spawn a whole industry of life- 
saving drugs for tens of thousands 
of people with cancer. 

Within 5 years of my start, we 
found that antibodies against 
CTLA-4 could harness T cells 
to destroy tumours in mice. 
Although extraordinary levels 
of current funding for variations 
on the checkpoint method (new 
molecules and reagents) and 
for T-cell therapies in general 
will yield more cures, the next 
generation of advances is likely 
to come from curiosity-based 
studies. These, too, will need 
substantial funding. 

Matthew Krummel University 
of California, San Francisco, USA. 
matthew.krummel@ucsf.edu 


Mentorship training 
will curb bullying 


Academic mentorship is not 
synonymous with supervision. 
It is the nurturing of researchers’ 
scientific and professional 
growth. Cultures that explicitly 
value and reward mentoring 
make it clear that bullying has no 
place in research (see also S. Moss 
Nature 560, 529; 2018). 
Crucially, mentorship also 
promotes constructive scientific 


dialogue between mentors and 
trainees. Just like research and 
teaching, however, mentoring 
philosophy and practice 

cannot be learnt in standalone 
workshops. It must be continually 
refined and improved through 
feedback and institutional 
support. Proven initiatives at my 
own institution, for example, 
include encouraging mentors 

to exchange experiences and to 
share best practices. 

Susan E. Liao Johns Hopkins 
University, Baltimore, Maryland, 
USA. 

seliao@jhmi.edu 


China tracks its 
progress on SDGs 


China's progress in meeting 
the United Nations Sustainable 
Development Goals (SDGs) is 
being successfully monitored 
using geospatial and statistical 
information in a pilot scheme 
running in Deqing county, 
Zhejiang province. 

A team of 20 researchers, led by 
the National Geomatics Center 
of China, measured 100 SDG 
indicators over the 938-square- 
kilometre county. In line with 
the UN Global SDG Indicator 
Framework, multi-scale and 
multi-type geospatial and 
statistical data were integrated 
for comprehensive measurement 
and evidence-based progress 
analysis. These data included 
topographic and land-cover 
maps, aerial and satellite images, 
disaggregated socio-economic 
information and environment 
statistics, as well as some from 
social media. 

The conclusion is that the 
county, which has a population 
of around 430,000, has made 
significant economic and social 
advances and maintained a good 
ecological environment over the 
past 5 years. Challenges such as 
inadequate public transport in 
some regions have been drawn to 
the attention of policymakers. 

An online public information 
service charts Deqing’s progress 
towards achieving the SDGs. The 
pilot scheme’s findings will be 
discussed at the UN’s first World
Geospatial Information Congress 
later this month. 

Jun Chen National Geomatics 
Center of China, Beijing, China. 
Zhilin Li Hong Kong Polytechnic 
University, Hung Hom, 

Hong Kong. 

chenjun@nsdi.gov.cn 


Tap into the joy of 
open-access data 


As an underfunded clinical
researcher working in Jordan,
I am limited to pursuing
inexpensive observational 
studies that are based on patients’ 
records. Happily, open-access 
data from repositories around the 
world have enabled me to make a 
bigger contribution to science. 

My best experience was with 
the Biologic Specimen and 
Data Repository Information 
Coordinating Center, which 
curates data from large studies 
funded by the National Institutes 
of Health. I was able to access 
rigorous, high-quality data from 
almost 1,200 people with an 
inflammatory disease known as 
sarcoidosis, along with a control 
group. In Jordan, it would take 
me until I retired to generate this 
much data first-hand. 

Our results will be published 
in the journal that hosted the 
original data. We completed 
two more studies on the same 
data set within six months (see 
S.A. AlRyalat et al. Curr. Respir. 
Med. Rev. 13, 241-246; 2017). 
Saif Aldeen AlRyalat University 
of Jordan, Amman, Jordan. 
saifryalat@yahoo.com 


Don’t dismiss 
Myers-Briggs 


In his review of Merve Emre’s 
book The Personality Brokers: 
The Strange History of Myers- 
Briggs and the Birth of Personality 
Testing, S. Alexander Haslam 
repeats the contention that the 
Myers-Briggs Type Indicator 
(MBTI) fails to “measure what
it purports to measure” and
to “elicit consistent responses
across testing contexts” and that
it “has low validity” (Nature
561, 176; 2018). As an author
of works on this topic, I have
presented considerable evidence
to the contrary (see, for example,
P. Moyle and J. Hackston J. Pers.
Assess. 100, 507-517; 2018).
Before dismissing the MBTI
as lacking in reliability and
validity, evaluators would do
well to consult this and other
peer-reviewed papers (see,
for example, T. Sitzmann et
al. J. Appl. Psychol. https://doi.org/10.1037/apl0000352; 2018).
They can then make an informed
judgement.

Penny Moyle Oxford, UK. 

pennymoyle@hotmail.com 


Mekong’s dams 
damn fisheries 


One of the world’s largest inland 
fisheries is under threat from 
overfishing, dams and habitat 
fragmentation. The Tonlé Sap 
Lake in the Mekong River Basin 
now yields stable harvests of 
only very small fish species. 
Stakeholders, government and 
developers must put conservation 
and mitigation measures in place 
before it’s too late. 

Hydropower construction 
is proliferating in the Lower 
Mekong Basin, disrupting 
natural seasonal river pulses and 
blocking the migration routes of 
riverine fishes. An estimated 60% 
of the catch in the Lower Mekong 
Basin is made up of migratory 
fish (G. Vaidyanathan Nature 
478, 305-307; 2011). People 
who live along the Mekong River 
and on the floodplain will be 
particularly affected because 
the local fish they consume 
also rely on migratory riverine 
species for prey. 

Without urgent action, the 
outlook is bleak for this once- 
sustainable fishery. 

Peng Bun Ngor* Fisheries 
Administration, Phnom Penh, 
Cambodia. 

*On behalf of 4 correspondents 
(see go.nature.com/2shx1kz for 
full list).


pengbun.ngor@gmail.com 
Leon Lederman


(1922-2018) 


Physicist who expanded the family tree of fundamental particles. 


Leon Max Lederman’s stunning
discoveries, leadership and advocacy
laid the foundations for particle
physics today. His discovery of the muon
neutrino established that there was more 
than one type of neutrino. His observation 
of muon decay knocked down a pillar theory 
about symmetry in one of the fundamen- 
tal forces. His discovery of the long-lived 
neutral kaon meson helped to home in on 
one of the great mysteries of physics. And 
his discovery of bottom quarks — subatomic 
particles that make up neutrons and protons 
— led researchers to uncover a third family 
of quarks. 

Self-effacing, approachable and imagina- 
tive, Lederman was a consummate joke-teller. 
“Physics is not religion,” he used to quip. “If it
were, we would have a much easier time raising
money.” He had good taste in research
problems and a gift for recognizing con- 
nections and opportunities. He was also a 
charismatic communicator; in later years, he 
focused on advancing science education. He 
died on 3 October in Rexburg, Idaho, aged 96. 

Lederman was born on 15 July 1922 in 
New York City to Jewish parents who had 
emigrated from Russia and who were keen 
on education. Lederman grew up at a time 
when Jewish scientists were fleeing immi- 
nent war in Europe for the United States, and 
he was attracted to the excitement surround- 
ing the twentieth-century physics revolution 
of which many scientists were a part. 

He did his bachelor’s degree in chemistry at 
the City College of New York in 1943. After 
three years of US Army service in Europe, he 
earned his PhD in physics from Columbia 
University in New York City in 1951. 

There, he spent three decades teaching and 
conducting experimental research as a faculty 
member. Nature was ripe for discovery, and 
clever experiments with particles accelerated 
to the highest energies were expected to yield 
the biggest discoveries. 

From his mentor at Columbia, Nobel- 
prizewinning physicist Isidor Isaac Rabi, 
Lederman learnt to distinguish observation 
(a hint of something new) from measure- 
ment (a more precise endeavour). Lederman 
became the master of both, with the crea- 
tivity to devise unique experiments and the 
tenacity to follow through on them. 

Lederman’s most famous work was done 
in 1962 at Brookhaven National Labora- 
tory in Long Island, New York, with two of 
his Columbia University colleagues, Jack 
Steinberger and Mel Schwartz. In what would 


today be called a ‘beam dump’ experiment, 
the researchers aimed a powerful proton 
beam at a target, producing an abundance 
of every type of known particle. These were 
absorbed by a wall of dense material. 

The experimenters examined the debris to 
see whether anything interesting emerged. It 
did: the muon neutrino, the second neutrino 
family to be discovered. The surprising dis- 
covery indicated that fundamental particles 
come in pairs, and advanced the idea that 
symmetry is intrinsic to nature’s building 
blocks. For this work, the team shared the 
1988 Nobel Prize in Physics. 

Lederman conducted experiments 
at high-energy accelerators around the 
world, and contributed to the founding of 
the 200-gigaelectronvolt Fermi National 
Accelerator Laboratory (now Fermilab) in 
Batavia, Illinois. 

Indeed, Lederman made Fermilab’s first 
major discovery, the bottom quark, in 1977. 
In an elegant experiment, his team looked 
for particles so rare that they would result 
from maybe one in a hundred-trillion collisions 
between an intense proton beam and a 
target. Lederman described the new, more-sensitive 
set-up as being as enlightening as 
the first telescope. They saw a signal coming 
from a bottom-antibottom quark pair. They 
named the signal upsilon, after the shape of 
the decay-particle trajectories, which resem- 
bled the Greek letter. 

Lederman was the second director of 
Fermilab from 1979 to 1989. He led the 
rigorous design and early operation of the 
Tevatron Collider (1983-2011), which 
was the highest-energy proton–antiproton 
collider in the world for nearly three decades. 


From 1982, Lederman championed the 
most ambitious accelerator project ever — 
the Superconducting Super Collider. But in 
1993, with construction already under way, 
Congress cancelled the project during tight- 
budget years. The torch was passed to the 
Large Hadron Collider at CERN, Europe's 
particle-physics laboratory near Geneva, 
Switzerland. Here, the Higgs boson was 
discovered — its popular name in the press, 
the ‘God Particle’, was taken from the 1993 
popular-science book Lederman co-wrote 
with Dick Teresi, The God Particle: If the 
Universe is the Answer, What is the Question? 

Lederman leaves a lasting educational 
legacy. As director of Fermilab, he intro- 
duced Saturday Morning Physics, a ten-week 
physics class for high-school students, which 
is still popular more than three decades on. 
He started the Friends of Fermilab, a school 
outreach effort that grew into the Lederman 
Science Center at Fermilab. He founded the 
Illinois Mathematics and Science Academy, 
a residential, state-supported high school, 
and he championed the ‘Physics First’ high- 
school science curriculum, which teaches 
foundational physics before chemistry and 
biology. 

His scientific legacy continues with 
ongoing efforts to explore the particles that 
he discovered. The neutrino, perhaps the 
most ubiquitous particle in the Universe, is 
still one of the least understood. Its tiny mass 
and quirky interactions remain puzzles. 
Alongside enormous neutrino detectors 
around the world, Fermilab is leading an 
ambitious global effort to study muon neutri- 
nos beamed from Chicago to South Dakota, 
to understand how they oscillate from one 
flavour to another. 

Several nations built dedicated accelera- 
tors to explore the bottom quark in detail. 
This year, Japan commissioned an ambitious 
new one at the KEK laboratory in Tsukuba, 
for example. Measurements of rare, bottom- 
quark decays seem to be harbingers of new 
physics beyond the standard model. 

Lederman, among others, is also credited 
with merging the sciences of the very small 
and the very big — particle physics and 
cosmology. Since Lederman’s heyday, the 
study of subatomic particles has been used 
to probe the early Universe and its most 
energetic phenomena. 


Nigel S. Lockyer is the director of Fermilab 
in Batavia, Illinois. 
e-mail: lockyer@fnal.gov 
Living systems engineered 


Engineering approaches allow biological structures and behaviours to be reconstituted in vitro. A biologist and a physicist 
discuss the potential and limitations of this bottom-up philosophy in providing insights into complex biological processes. 


THE TOPIC IN BRIEF 

• In bottom-up approaches, cellular 
structures and behaviours are reconstituted 
from their constituent parts. 

• This strategy reduces a complex, 
living system to a more manageable 
set of component parts, defined by the 
researcher. 

• Bottom-up experiments have provided 
insights into processes such as cell division, 
chromosome packaging and tissue 
patterning (Fig. 1). 

• Some researchers believe that any cellular 
behaviour could eventually be modelled 
from the bottom up. 

• Others, however, argue that this strategy is 
insufficient for understanding more-complex 
biological functions that bridge scales of 
complexity, for example those in which 
individual cells act collectively. 


Understanding 
by building 


The complexity of living cells is staggering: 
their cytoplasm contains tens of thousands 
of distinct macromolecules and metabolites 
that must be coordinated to interact in 
time and space. This intricacy often makes it 
difficult to interpret results from conventional 
‘top-down’ studies, in which individual components 
are removed or modified. Bottom-up 
approaches complement investigations of 
living cells, make it easier to define the rules 
that govern biological organization, and 
have provided insights into many previously 
intractable biological problems. 

Biochemical reconstitution has conventionally 
been used to identify the minimal set of 
purified factors needed to recapitulate a given 
cellular activity in vitro. Today, spurred on by 
advances in materials and engineering, we 
can carry out more-ambitious cellular reconstitutions. 
These experiments combine biochemical 
reconstitution with defined spatial 
boundaries that can be generated in a range 
of ways, including through micropatterning 
and microfabrication of surfaces, and through 
microfluidic techniques that create compartments 
surrounded by membranes made up 
of lipid bilayers or monolayers. The boundaries 
can be set to either mimic or perturb the 
natural organization of a cell. 

Unexpected properties and activities have 
emerged from the mixing of boundaries 
and biochemical reactions, leading to new 


mechanistic insights and extending our abil- 
ity to model biological processes. For instance, 
it has been found that encapsulating the motor 
protein kinesin with filaments called micro- 
tubules produces a force-generating network 
that can propel the movement of cell-size 
capsules. This experiment acts as a proof of 
concept that simplified systems can gener- 
ate force, and motivates researchers to look 
for analogous systems in nature, rather than 
assuming that the solution to this problem in 

living organisms must always be complex. 
As another example, changes in the size 
and shape of encapsulated compartments 
(manipulations that are easy in microfluidic 
systems but not in living cells) have revealed 
roles for these parameters in controlling the 
oscillatory behaviour of proteins involved in 
cell division and the sizes of organelles. 
These discoveries act as a potent 
reminder that experiments carried out in test 
tubes involve much greater volumes than 
that of a cell; an absence of boundary conditions 
might obfuscate underlying biological 
principles, such as size scaling. 

BOTTOM-UP BIOLOGY 
A Nature special issue 
go.nature.com/bottomupbiology 

In some cases, biological insight can 
be gained only by removing boundaries, 
to simplify the system or enable it to be 
expanded, another feat not possible in 
living cells. For example, cytoplasmic extracts 
from cells have been used to show that chro- 
mosomes can condense in preparation for cell 
division even in the absence of histone proteins, 
around which DNA is packaged in cells. 
Because histones are essential to living cells, 
such experiments could be performed only 
ex vivo. 

The same cytoplasmic-extract system has 
also been combined with an elongated cham- 
ber (many times longer than a cell) to identify 
a new class of signalling reaction that spatially 
coordinates the cell cycle and is based on self- 
propagating ‘trigger waves’. These waves, first 
postulated by mathematical modelling, could 
be identified and characterized more easily in 
a deconstructed system containing defined 
boundary conditions than by in vivo methods. 

For bottom-up approaches to further 
complement and one day even surpass top- 
down approaches, certain challenges must be 
addressed. One is deciding which elements 
of a cell can be removed without limiting the 
biological relevance of the findings. Another 
is to identify empirical approaches to validate 
whether rules derived from in vitro experi- 
ments are predictive for in vivo situations. 
A third is that we must continue to improve 
the precision and reliability of engineered 
boundaries to better mimic those in cells. 

Bottom-up reconstitution approaches 
promise to expand our understanding of biol- 
ogy at higher levels. By continuing to increase 
the number of components that can be pat- 
terned together, it might eventually be possible 
to reconstruct systems that rival the complex- 
ity of living cells and tissues. For example, 
micropatterning techniques and assembly 
principles are being used to configure cells in 
geometries that mimic those found in embryonic 
development, and to produce designer 
tissues. In these experiments, complex, controllable 
3D tissue shapes can be obtained by 
defining simple interactions between indi- 
vidual cells, between cells and the structural 
matrix that surrounds them and between pro- 
teins involved in tissue patterning. Given well- 
characterized building blocks, a preliminary 
understanding of boundary conditions and a 
reasonable period of time, it should be realistic 
to consider reconstituting any complex cell 
behaviour. From there, a new understanding 
of the rules that underlie biological processes 
can emerge. 

Figure 1 | Increasing levels of complexity modelled from the bottom up. a, In in vitro reconstitution, 
biological parts and processes can be recreated from a minimal set of components. For example, by 
enclosing individual molecules within specific boundaries, the minimal set of proteins and interactions 
needed for the emergence of complex phenomena such as cell division can be defined. Likewise, spatial 
confinement of cells has provided insights into tissue patterning in embryos. b, However, there is debate 
about whether these bottom-up approaches can provide mechanistic insight into biological processes that 
bridge scales of complexity (the processes by which thousands of individual molecules interact in cells, 
for instance, or by which cells act collectively to switch from a fluid-like to a solid-like state). 


Matthew Good is in the Departments of 
Cell and Developmental Biology and of 
Bioengineering, University of Pennsylvania, 
Philadelphia, Pennsylvania 19104, USA. 
e-mail: mattgood@pennmedicine.upenn.edu 


Bottom does not 
explain top 


Imagine yourself as an engineer in the car 
industry. You know the role of every bolt, 
joint and circuit in a car. You have great understanding 
of the hierarchies and redundancies 
that ensure smooth functioning of all systems. 
In a nutshell, you know how to build a car from 
the bottom up. Now imagine that you are asked 
to solve the problem of traffic jams during rush 
hour. Traffic jams are a problem of cars, and 
you know everything about cars — but this 
detailed knowledge is irrelevant to under- 
standing why they jam. Similarly, an under- 
standing of how complex biological structures 
or even whole cells are built can provide only 
a certain level of understanding about how 
biological systems function at higher levels of 
organization. 

Much like cars in traffic, cells can jam. 
At low density, cells grown in culture move 
and exchange neighbours frequently, like 


molecules in a fluid. But at higher densities, 
cellular movements slow down, rearrange- 
ments vanish, and the system jams into a 
solid-like state. Other variables can also 
cause cell jamming: when cultured at the 
same density, epithelial cells obtained from 
people with asthma move collectively, like a 
fluid, whereas their healthy counterparts jam. 
Cell jamming also occurs in vivo, and is key to 
the normal elongation of zebrafish embryos 
during development. 

Jamming is a prominent example of a mesoscale 
phenomenon (a process that operates 
at a longer scale than do the elementary components 
of a system) in which ‘bottom’ does 
not explain ‘top’. It is unlikely that being able 
to reconstitute the differing protein pathways 
in healthy and asthmatic cells will explain why 
one jams more easily than the other. In fact, the 
most predictive variable for cell jamming is the 
shape index, a geometric quantity defined as 
the ratio of the cell’s perimeter to the square 
root of its area. Similarly, it is also unlikely that 
the genetic programs that are turned on and off 
during development will explain why cells jam 
as the vertebrate body elongates. 
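
The shape index described above is simple enough to compute directly. The following is a minimal Python sketch, assuming that each cell outline is approximated by a simple polygon given as a list of (x, y) vertices; the function names and the example outlines are hypothetical illustrations, not data or methods from the studies cited here.

```python
import math


def shape_index(vertices):
    """Return perimeter / sqrt(area) for a simple polygon.

    `vertices` is a list of (x, y) points in order around the outline.
    """
    n = len(vertices)
    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        perimeter += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0  # shoelace formula term
    area = abs(twice_area) / 2.0
    if area == 0.0:
        raise ValueError("degenerate outline with zero area")
    return perimeter / math.sqrt(area)


def regular_polygon(n_sides, radius=1.0):
    """Vertices of a regular polygon (a hypothetical, idealized cell outline)."""
    return [
        (radius * math.cos(2 * math.pi * k / n_sides),
         radius * math.sin(2 * math.pi * k / n_sides))
        for k in range(n_sides)
    ]


# Hypothetical outlines: a compact hexagon, a square and an elongated rectangle.
hexagon = regular_polygon(6)
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
elongated = [(0, 0), (4, 0), (4, 1), (0, 1)]

for name, outline in [("hexagon", hexagon), ("square", square), ("elongated", elongated)]:
    print(f"{name}: shape index = {shape_index(outline):.3f}")
```

Because the perimeter is divided by the square root of the area, the index does not depend on cell size, only on shape: compact outlines give lower values than elongated ones.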

Another example of a process in which 
mechanisms at the bottom cannot explain phe- 
nomena at the top is collective gradient sens- 
ing, whereby a group can sense and respond 
to an environmental gradient but individual 
constitutive elements cannot. Collective gra- 
dient sensing has been observed in a variety 
of groups undergoing directed migration, 
including shoals of fish moving in response to 
light, clusters of leukaemia cells responding 
to chemical signals, and epithelial cells whose 
migration is guided by a gradient of rigidity. 
The reconstitution of each element of these 
motile groups, no matter how detailed, will 
never explain why groups move in the direction 
of the gradient but individual elements do not. 

This is not to say, of course, that reconsti- 
tution of molecular-scale processes is not 
useful. Indeed, the engineering of genetic 
circuits responsible for cell communication is 
sufficient to control 3D tissue shape. However, 
principles at the molecular scale cannot 
generally explain functions at a higher level of 
organization. 

Well before its discovery in cells, jamming 
was proposed to tie together liquid-to-solid 
transitions in a wide range of inert materials 
such as foams, emulsions and sand piles. The 
materials differ broadly in composition, but 
their physical behaviour can be captured by a 
simple set of physical variables. These variables 
configure a phase diagram — a graph that plots 
whether the system is jammed or unjammed 
as a function of interaction forces, temperature 
or density. Like jamming in inert and living 
matter, other biological functions might be 
explained in terms of phase diagrams. Given 
the complexity of cells and tissues, the posi- 
tion of a system in those diagrams will be 
compatible with many different combina- 
tions of molecular states, concentrations and 
interactions. 

It is questionable whether trying to 
understand each of these combinations is an 
efficient path towards a predictive under- 
standing of complex living systems. Rather, 
we should aim to identify the mesoscale prin- 
ciples and variables that ultimately determine 
how these systems work. To do so, we need to 
develop experimental approaches to probe 
tissues at multiple length scales and timescales 
through both mechanical and biochemical 
manipulation. 


Xavier Trepat is at the Institute for 
Bioengineering of Catalonia, Barcelona 
Institute for Science and Technology, ICREA 
and CIBER-BBN, 08028 Barcelona, Spain. 
e-mail: xtrepat@ibecbarcelona.eu 
Proposed early signs of 
life not set in stone 


Efforts to find early traces of life on Earth often focus on structures in ancient 
rocks, called stromatolites, that formed by microbial activity. One of the oldest 
proposed stromatolite discoveries has now been questioned. SEE LETTER P.241 


MARK A. VAN ZUILEN 


In 2016, Nutman et al. reported the 
discovery of cone-shaped structures in 
3.7-billion-year-old rocks in the Isua 
Supracrustal Belt, Greenland, that they 
identified as being stromatolites: structures 
that arise as a result of the presence of 
water-dwelling microorganisms. Previously, 
the earliest known stromatolites were reported 
to be those in 3.45-billion-year-old rocks in 
Australia. Being able to accurately date the 
first signs of the emergence of life has important 
implications for understanding how life 
on Earth evolved. However, on page 241, 
Allwood et al. now report their own independent 
analysis of these ancient rocks in 
Greenland, and argue that, in this particular 
case, the structures that Nutman and colleagues 
interpreted to be stromatolites instead 
arose by non-biological processes. This finding 
shows that a natural process that does not 
require any input from a living organism can 
mimic the formation of a structure that normally 
counts as a strong indication of previous 
biological activity. 

Figure 1 | Layered structures in ancient rocks. a, Conical structures that 
have internal layers (laminae) and are found in ancient rocks have been 
identified as a type of stromatolite structure (specifically, a stromatolite 
that forms as the result of the action of water-dwelling microorganisms). Such 
stromatolites, which typically have a size on a centimetre scale in these ancient 
rocks, have been cited as providing early evidence of life on Earth. However, 
the positive identification of stromatolites can be controversial, given that 

Stromatolites have a laminated (layered) 
structure (Fig. 1a), formed by sediment 
trapping, binding and mineral deposition 
within microbial communities. They can 
form in a range of shapes: conical, columnar 
or dome-like. Whether microorganisms have 
a role in the formation of certain types of stro- 
matolite shape is unclear. There are models 
for how stromatolites can arise without input 
from a living organism, and various laminated 
structures that occur naturally without requiring 
any biological activity can be mistaken for 
stromatolites, such as silica deposits around 
geysers or laminated carbonate crusts that 
form when water evaporates. In well-preserved 
stromatolite specimens, a biological contribu- 
tion to such structures can often be confirmed 
by the presence of complex branching, intri- 
cate laminar textures, cavities or, in some rare 
instances, preserved microfossils and moulds. 

Conical stromatolites are a special case, how- 
ever, because their shape alone can be sufficient 
to identify them as arising from biological 
processes. Their steep laminar slopes can- 
not arise from non-biological processes such 
as sedimentation or mineral precipitation. 
From the analysis of present-day stromatolites 
and laboratory experiments, it is known that 
conical stromatolites are the preserved remains 
of motile microbial communities that form 
vertical cones, and that this cone structure 
can be preserved by the trapping, binding and 
precipitation of non-biological material. 

When stromatolite structures in the early 
rock record (which often have a centimetre- 
scale size) are analysed, their intricate lamina- 
tions, textures and composition have usually 
already been partially or completely destroyed 
through a process called metamorphism, in 
which rock structure is substantially altered 
and deformed by heat and pressure, often 
when the rock is buried deep underground. 
Stromatolite shape therefore becomes the 
main way to identify signs of biological input 
in ancient stromatolite-like structures. In the 
strongly metamorphosed Early Archean rock 
record (formed around 3.2 billion to 4 billion 
years ago), the identification of stromatolites 
arising from biological processes thus becomes 
particularly difficult. 

However, a convincing case was made 
for the presence of such biologically arising 
stromatolites in 3.45-billion-year-old rocks in 
Australia. In addition to conical stromatolites, 
six other stromatolite shapes were found in the 
samples there; they all existed in specific parts 
of what was considered to be an ancient, shal- 
low, marine, carbonate-rich environment. This 
diversity in stromatolite shape convincingly 
excluded a uniform non-biological formation 
process and suggested that ecological controls 
governed the overall stromatolite growth. 
Evidence of such a clearly defined ancient 
environmental setting is difficult to find in any 
older metamorphosed rock on Earth. 

ancient rocks have been subject to deformations over time. b, Processes 
of rock extension and compression might create cone-like structures that 
look like stromatolites, and the deformation and replacement of layers of 
sedimentary rock might generate structures that look similar to stromatolite 
laminae that arise from biological activity. Allwood et al. argue that structures 
previously identified as stromatolites in 3.7-billion-year-old rocks in 
Greenland might have formed through such processes. 

Nutman and colleagues reported the 
identification of ancient stromatolites in 
a newly described rock outcrop in Greenland, 
and also interpreted these structures as 
having arisen in an ancient, shallow, marine 
environment, on the basis of the textures of 
interlayered sediments and the distribution 
patterns of rare-earth elements. Such patterns 
have previously been interpreted to indicate 
the deposition of carbonate minerals from 
seawater. The entire region in which these 
rocks are located was previously found to be 
metamorphosed rock that had been subjected 
to high temperature and pressure. In the Australian 
rocks with ancient stromatolites, laminations 
are clearly visible; in the Greenland 
samples, however, the proposed laminations 
are less clear, and the degree of metamorphism 
is higher than that of the Australian rocks. 

The lack of unambiguous, well-preserved 
laminated structures would preclude the 
identification of any intricate original tex- 
tures that might indicate biological input to 
the structure. However, Nutman et al. iden- 
tified remnant laminations and conical stro- 
matolite-like shapes that they consistently 
interpreted as being microbially generated 
structures. Apart from these conical shapes, 
Nutman and co-workers also identified some 
dome-like shapes of proposed stromatolites. 
However, they did not find the diversity of 
stromatolite forms described in the Austral- 
ian study. With few specimens, and a complex 
history of rock metamorphism, this raised the 
question of whether non-biological processes 
might have generated the dome-like and coni- 
cal shapes in these ancient Greenland rocks. 

Allwood et al. argue that the stromatolite- 
like shapes observed at the Greenland site arise 
from rock deformation. When they compared 
the front and side profiles of rock samples that 
contained stromatolite-like structures, they 
noted that one side shows a compressional 
deformation whereas the other shows an 
extensional deformation. This indicates that 
the structures are not stromatolite cones, but 
elongated ridges (Fig. 1b). Furthermore, the 
folding direction of the stromatolite ridges is 
parallel to the orientation of pressure-induced 
mineral textures on smaller scales in the same 
rock. These observations provide strong evi- 
dence for physical rock deformation and there- 
fore offer a non-biological explanation for the 
observed structures. 

In addition, Allwood and colleagues argue 
that the rock itself did not form in a shallow 
marine setting, but instead arose when carbon- 
ate minerals crystallized from fluids that circu- 
lated through an existing rock. If this is true, 
the observed dome-like and conical structures 
are definitely not stromatolites. Allwood et al. 
used a trace-element analysis technique that 
has high spatial resolution to show that the 
internal laminations in the conical structures 
represent the specific replacement of a type of 
silicate rock by fluid-derived carbonate min- 
erals. The authors found that the rare-earth- 
element signal associated with the presence of 
seawater seems to be mainly concentrated in 
mica minerals in the rock, but is also present in 
the carbonate areas. Allwood and co-workers 


suggest that this is possible if the fluids from 
which the minerals crystallized during later 
stages of the rock’s existence ultimately derived 
from seawater as well. So although Nutman et 
al. and Allwood et al. report similar patterns 
of rare-earth elements in the rocks, they 
offer diverging interpretations of what these 
patterns mean. This highlights the complexi- 
ties in discerning primary chemical signatures 
in such highly deformed rocks. 

The biological input to ancient stromato- 
lites is a long-standing controversy. The rocky 
outcrop in Greenland was discovered only 
recently, and few researchers have studied this 
rock in relation to its geological surroundings. 
Future research might lead to a firm under- 
standing of the primary versus secondary pro- 
cesses that shaped this rock. Clearly, the work of 
both Nutman et al. and Allwood et al. will form 
the basis for the interpretation of other possible 
stromatolites in the ancient rock record. 
Quenching our thirst 
for universality 


Understanding the dynamics of quantum systems far from equilibrium is one of 
the most pressing issues in physics. Three experiments based on ultracold atomic 
systems provide a major step forward. SEE LETTERS P.217, P.221 & P.225 


MICHAEL KOLODRUBETZ 


Although we live in a world of constant 
motion, physicists have focused 
largely on systems in or near equilibrium. 
In the past few decades, interest 
in non-equilibrium systems has increased, 
spurred by developments that are taking quan- 
tum mechanics from fundamental science to 
practical technology. Physicists are therefore 
tasked with an important question: what organ- 
izing principles do non-equilibrium quantum 
systems obey? On pages 217, 221 and 225, 
respectively, Prüfer et al., Eigen et al. and Erne 
et al. report experiments that provide a partial 
answer to this question. The studies show, for 
the first time, that ultracold atomic systems far 
from equilibrium exhibit universality, in which 
measurable experimental properties become 
independent of microscopic details. 

The researchers use low-density gases of 
rubidium or potassium atoms that are 
cooled to temperatures close to absolute 
zero. At sufficiently low temperatures, these 
atoms begin to show quantum-mechanical 
behaviour, forming a macroscopic quantum 
state known as a Bose-Einstein condensate. 


Starting from either such a condensate or 
an uncondensed gas, the researchers rapidly 
change experimental parameters — a process 
known as a quench. Rather like a cartoon 
character that looks down to discover they 
have accidentally run off a cliff, the quench 
initiates far-from-equilibrium dynamics. 

Such quenches are relatively easy to real- 
ize, but what the researchers see next is sur- 
prising. Consider all the variables that can be 
associated with a given experiment: power 
fluctuations of lasers, variations in the lab’s 
temperature, microscopic details of atomic 
interactions, and so on. The researchers 
find that the dynamics of their experiments, 
despite involving strongly interacting atoms 
far from equilibrium, become independent of 
these variables. 

Eigen et al. accomplish this universality by 
carefully eliminating all but two of the vari- 
ables in their experiment: the density of the 
atomic gas and the scattering length. The latter 
describes how closely two atoms can pass with- 
out interacting. The authors then go one step 
further and eliminate the dependence of the 
scattering length on variables in a clever way. 

First, to prepare the initial condensate, the 
authors set the scattering length to zero (they 
‘turn off’ the interactions) using a magnetic 
field. Second, they quench the scattering 
length to infinity, again using the magnetic 
field. If we consider increasing the density 
of the gas by, for example, a factor of eight, 
the spacing between the atoms decreases by 
a factor of two. Zooming in (rescaling) by this 
factor of two, the atomic system looks exactly 
the same as it did before the density was 
increased, because the scattering lengths of 
zero and infinity are unchanged. 
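
The arithmetic behind this rescaling argument can be checked in a few lines. The Python sketch below assumes only that the typical spacing between atoms in a three-dimensional gas scales as the inverse cube root of the density; the function names, the arbitrary units and the finite example value of 0.1 are hypothetical choices for illustration and do not come from the experiments.

```python
import math


def mean_spacing(density):
    """Typical interatomic spacing in a 3D gas, up to a constant prefactor.

    Assumes the spacing scales as density ** (-1/3).
    """
    if density <= 0:
        raise ValueError("density must be positive")
    return density ** (-1.0 / 3.0)


def scattering_length_in_spacings(scattering_length, density):
    """Scattering length measured in units of the interatomic spacing."""
    return scattering_length / mean_spacing(density)


n0 = 1.0        # hypothetical reference density, arbitrary units
n1 = 8.0 * n0   # density increased by a factor of eight

spacing_ratio = mean_spacing(n1) / mean_spacing(n0)
print(f"spacing shrinks by a factor of {1.0 / spacing_ratio:.2f}")

# A scattering length of zero or infinity looks the same at either density
# once lengths are measured in units of the spacing; a finite value does not.
for a in (0.0, 0.1, math.inf):
    before = scattering_length_in_spacings(a, n0)
    after = scattering_length_in_spacings(a, n1)
    print(f"a = {a}: {before} -> {after} (unchanged: {before == after})")
```

The finite value is included only as a contrast: it is the one case in which zooming in by the factor of two changes the rescaled picture.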

Eigen and colleagues vary the density of the 
gas by a factor of about ten, and observe that 
the experimental dynamics are independent 
of the density after rescaling both space and 
time. They also adjust the temperature of the 
gas and show that universality holds when 
one more variable is considered — namely, 
the length scale on which the gas exhibits 
quantum-mechanical behaviour. 

Prüfer et al. and Erne et al. uncover a 
different form of universality. On the face of 
it, the experiments of these two groups are 
wildly different. Erne and colleagues start 
with a three-dimensional gas, quench to one 
dimension, and observe the density of the gas 
as a function of position and time. Prüfer and 
colleagues work in one dimension throughout, 
explore the internal states (spins) of the atoms 
and carry out a quench that allows these spins 
to fluctuate. But, after a short time, both groups 
observe universality, which they argue results 
from a phenomenon called a non-thermal 
fixed point. 

For systems in equilibrium, the concept of 
a fixed point comes from one of the great dis- 
coveries of twentieth-century physics, known 
as the renormalization group. This framework 
studies how a system evolves as we zoom out 
from the microscopic to the macroscopic scale, 
and successfully describes the emergence of 
key phases of matter such as magnetism. 
Fixed points are states of a system that remain 
unchanged on zooming out. Non-thermal fixed 
points occur when non-equilibrium systems 
approach such a state, with the role of zooming 
out played by the passage of time. 

A classic example of a non-thermal fixed 
point is wave turbulence, in which the energy 
of waves is transferred from large to small 
scales. Prüfer et al. and Erne et al.
demonstrate the first examples of universality caused
by non-thermal fixed points in systems domi- 
nated by quantum mechanics. Like Eigen and 
colleagues, the groups show that their results 
are robust by widely varying the initial condi- 
tions of their experiments and observing that 
the dynamics are effectively unchanged. 

Although Prüfer et al. and Erne et al. use
different quenches and measure different 
properties, their results are remarkably simi- 
lar. This resemblance provides perhaps the 
best evidence for the existence of universality 
in these atomic systems. At a technical level, 
the experiments do differ in their critical exponents
(numbers that describe the properties of fixed
points), which indicates that the two
fixed points are different.

Together, these three studies provide a 
substantial step forward in our understand- 
ing of quantum systems far from equilibrium. 
However, a complete picture of the under- 
lying universality remains to be determined. 
A notable concern for all of the experiments is 
that the universality occurs over limited time 
and length scales. Longer times, in particular, 
would probably be required to realize non- 
equilibrium steady states that are useful for 
practical applications. By analogy with wave 
turbulence, one possibility for extending the 
reach of the universality could involve con- 
tinuously pumping energy into the systems; it 
is well documented that universality is, at best, 
transient in the absence of an external drive. 

From a fundamental perspective, these
experiments pave the way for exploring a wide
range of theoretical and experimental questions
regarding non-equilibrium universality.
For example, what are the possible classes of
non-thermal fixed points? What happens at
extremely high or low energy scales, at which
the universality breaks down? And under what
conditions does universality arise in generic
quenched systems? These are challenging questions
to answer, but I, for one, hope that these
experiments open the door to placing non-equilibrium
quantum systems alongside equilibrium
ones in the lexicon of modern physics.


BIOPHYSICS

Cellular stretch reveals
superelastic powers 


External forces can make cells undergo large, irreversible deformations. It 
emerges that stretched mammalian cells grown in vitro can enter a state called 
superelasticity, in which large, reversible deformations occur. SEE ARTICLE P.203 


MANUEL THERY & ATEF ASNACIOS 


In Rudyard Kipling’s classic children’s
bedtime story¹, the elephant’s elongated
trunk arose because a crocodile grabbed
“and pulled, and pulled, and pulled” on the
nose of an elephant’s child. The elephant’s
child escaped, but waited in vain for its nose
to shrink back to normal. This scenario of an
irreversible extension mirrors what happens
in the laboratory when cells that are subject
to external tension undergo major deformation.
However, Latorre et al.² report
on page 203 that mammalian epithelial cells
grown in vitro can, unexpectedly, demonstrate
a mode of reversible, large-scale shape changes,
a property termed superelasticity.

When our skin gets cut, it breaks apart at the
wound site. This is because the surface of skin,
like that of most organs, is subjected to tension.
This tension helps to limit the size and sculpt
the shape of organs. Moreover, a cell can both
generate and resist tension. In the cytoplasm,
there are fibre-like elements of the cell’s structural
‘skeleton’, called cytoskeletal filaments, that
can transmit force. The type of cytoskeletal filaments
that form from the protein actin can be
moved by myosin proteins to generate the contractile
forces that regulate cell shape. Adhesion
sites that join cells together can relay this force
between cells and cause tension to build up 
throughout an entire tissue³. However, cells
under tension do not usually tear apart, because 
their material properties enable them to resist 
this tension*”. 

If cells under tension undergo small-scale 
deformations, the resulting changes are mainly 
elastic, and a linear relationship exists between 
an increase in tension and an increase in deformation.
But in large-scale deformations, cells
can enter a state termed plasticity, in which 
the breakage of bonds between cytoskeletal 
filaments leads to irreversible deformations 
that prevent full cellular recovery, even if the 
associated stress is released’. Latorre and col- 
leagues describe a mechanism whereby cells 
under tension that undergo large-scale defor- 
mations change from being in an elastic state 
to enter a regime in which the cells elongate 
without requiring an increase in tension. 
Moreover, these deformations are reversible, 
indicating that cells can shift from an elastic 
state to what is called a superelastic state, and
thus avoid entering a state of irreversible
deformation. 
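
The contrast between the elastic and superelastic regimes can be pictured with a toy tension-strain curve. The sketch below is an illustration only and is not the authors' model: the function name, the stiffness and the threshold strain are hypothetical, and it assumes a perfectly flat plateau above the threshold.

```python
def toy_tension(strain, stiffness, threshold_strain):
    """Toy tension-strain relation (illustrative assumption, not from the article).

    Below threshold_strain the response is linear (elastic):
    tension = stiffness * strain.
    Above it, tension stays constant while strain keeps growing
    (a flat, superelastic-like plateau).
    """
    if strain < 0:
        raise ValueError("strain must be non-negative")
    return stiffness * min(strain, threshold_strain)


# Hypothetical values in arbitrary units.
for strain in (0.5, 1.0, 2.0, 4.0):
    print(strain, toy_tension(strain, stiffness=2.0, threshold_strain=1.0))
```

In this picture, doubling the strain beyond the threshold leaves the tension unchanged, which is the defining feature of the superelastic regime described above.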

Why has superelasticity not been 
previously detected in living cells, despite 
decades of investigations into cellular 
properties? One explanation could be 
that the timescale matters. In previous
experiments, external forces have usually
been applied for seconds or minutes,
whereas Latorre and colleagues studied
changes that occurred over several hours.
Cells subjected to rapidly increasing tension
often rupture, even at low tension
levels, in just a few minutes, whereas
even if the tension is 100 times higher, it
can be resisted if cells stretch at their own
rate, such as if they slowly spread out over
a surface.

Latorre and colleagues grew monolay- 
ers of mammalian epithelial cells in vitro 
on a deformable substrate surface that 
enabled them to estimate the forces act- 
ing on the system. The authors exploited 
the ability of cells to pump water from the 
upper to the lower side of the cell (Fig. 1). 
This induced a build-up of water under- 
neath the cells, generating pressure that 
caused a dome-like bulging of the cell 
layer under tension. Using microscopy 
and physical-modelling techniques, 
Latorre and co-workers precisely meas- 
ured the tension in these cellular domes. 
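
One textbook way to relate the pressure and curvature of a dome to the tension in its wall is Laplace's law for a thin spherical cap, tension = pressure × radius of curvature / 2. The sketch below assumes that relation and uses made-up numbers; it is not taken from the article and is not necessarily the procedure the authors followed. All function and parameter names are hypothetical.

```python
def cap_radius_of_curvature(base_radius, height):
    """Radius of curvature of a spherical cap from its base radius and height."""
    if base_radius <= 0 or height <= 0:
        raise ValueError("base_radius and height must be positive")
    return (base_radius ** 2 + height ** 2) / (2.0 * height)


def laplace_tension(pressure, radius_of_curvature):
    """Wall tension of a thin spherical shell (Laplace's law): sigma = P * R / 2."""
    return pressure * radius_of_curvature / 2.0


# Hypothetical dome: 50 um base radius, 20 um height, 100 Pa internal pressure.
radius = cap_radius_of_curvature(base_radius=50e-6, height=20e-6)
tension = laplace_tension(pressure=100.0, radius_of_curvature=radius)
print(f"R = {radius:.3e} m, tension = {tension:.3e} N/m")
```

With pressure in pascals and lengths in metres, the returned tension is in newtons per metre.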

The regularity of the curvature of the 
structures indicated constant tension, 
in which all of the cells experienced the 
same level of force. However, the uni- 
formity of deformation was lost above a 
certain deformation threshold, and some 
cells in the domes became more stretched 
than others. That this occurred in only
some cells was probably due to
variability in cellular mechanical properties,
whereby some cells were more sensitive
than others to external tension. The
highly stretched cells had entered a state 
of superelasticity: they were no longer 
resistant to the tension and instead elon- 
gated under this constant force. It is as 
if the cells had transitioned towards ‘fluidization’
of their cytoskeletal filaments;
in other words, instead of behaving like a 
coiled spring that resists force, these cells 
entered a distinct mechanical state akin 
to the way in which a liquid flows. 

How do cells resist the application of 
force solely until a certain level of deformation 
is reached, yet after that point, transition to a 
state of superelasticity? One explanation could 
be the availability of actin. The pool of cellular 
actin is limited, and each actin-based structure
assembles itself at the expense of other potential 
such structures. When all cellular pools of actin 
are exhausted, actin-based structures can't form 
any more. 

Latorre et al. focused on the cell cortex, a 
meshwork of actin filaments and myosin that 
forms a thin layer beneath the cell membrane. 



Figure 1 | Cells can exist in a state of superelasticity. a, Latorre 
et al.² analysed the stretching of monolayers of mammalian
epithelial cells grown in vitro on top of a layer of fibronectin
protein (blue). These cells can pump water from the upper to the 
lower side of the cell (green arrows). This resulted in a build-up 
of water beneath a dome-like layer of cells. The water build-up 
generates pressure and puts the cells under tension. b, The 
tension caused some stretched cells to undergo small-scale elastic 
deformations in which there is a linear relationship between the 
increase in tension and the extent of cellular deformation. The 
authors unexpectedly discovered that some cells underwent large- 
scale deformations and entered a state of superelasticity, in which 
cellular deformation increases without requiring a corresponding 
increase in tension. c, As pressure inside the dome-like layer of
cells rose, rupture eventually occurred as a result of the breakage 
of an adhesive junction between cells; water escaped through this 
breakage point. d, The cellular deformations were fully reversible, 
and the cells returned to their initial small size. 


The authors measured cortex thickness, which 
is tightly coupled to the tension that the cortex is 
under. Cortex thickness decreased with cellular
stretching, suggesting that a sufficiently thick
cortex might be required to enable cellular elasticity,
and raising the possibility that, below a
certain thickness, the cortex stops resisting tension
and starts to ‘flow’. However, the authors
could not identify a clear transition in the 
structure of the cortex between that found in 
an elastic or in a superelastic state. 

Perhaps finer details of the actin-network 
architecture are crucial for understanding
the transition to superelasticity. The 
protein-mediated crosslinking of actin 
filaments ensures that components of 
the actin network are connected success- 
fully, and that they function as a whole, 
rather than as numerous independent 
units. It’s possible that, as the cortex
thins during cell stretching, a point is 
reached when this network connectiv- 
ity is lost, and disconnected parts of 
the network start separating, or ‘flowing
apart’, under tension. If this is true,
the density of actin-filament crosslink- 
ing proteins might be a key factor in 
the transition towards superelasticity. 
Moreover, the investigation of actin- 
network density and crosslinking dur- 
ing specific developmental stages might 
reveal whether superelastic deformations 
occur as tissues are being shaped during 
development. 

A cellular elongation process that 
does not require an increase in force for 
increased deformation will end when 
rupture occurs. Latorre and colleagues 
note that when the layer of cells rup- 
tured, holes appeared between adjacent 
cells. Pressurized water inside the dome 
escaped through the rupture point, the 
dome collapsed and the superelastically 
stretched cells returned to their initial 
unstretched size. That the adhesive junc- 
tions joining cells, rather than the cells 
themselves, are the points of weakness, 
is consistent with observations of tissue 
rupture made using externally stretched 
cellular monolayers.

If the adhesive junctions had not rup- 
tured in Latorre and colleagues’ experi- 
ments, would the individual cells have 
kept on elongating? Probably not. A 
type of cytoskeletal structure called an 
intermediate filament might have an 
effect in this type of scenario. Intermedi- 
ate filaments have been under-studied in 
comparison with actin filaments because 
of their slow turnover dynamics and the 
absence of convenient experimental 
tools, such as drugs, that can disassemble 
them. However, their importance in cellular
mechanics is gaining recognition.

Intermediate filaments make a sub- 
stantial contribution to the elasticity of 
stretched cells and can support extensive
levels of stretching. These filaments typically
form in a wheel-spoke-like architecture
that connects the nucleus to junctions between
epithelial cells. This raised another question:
might these filaments have a role in the resist- 
ance to tension at high strain that would allow 
cells to limit their deformation? 

Evidence directly supporting this possibility 
has been lacking. Latorre et al. used a laser to cut 
bundles of intermediate filaments in stretched 
cells in a state of superelasticity, and found that 
this induced cellular relaxation — a release 
of stress and an elongation that increased the
cellular area. This suggests that intermediate 
filaments might protect superelastic cells from 
undergoing an unlimited deformation by acting 
like springs that resist tension at high levels of 
deformation. In such circumstances, the abil- 
ity of intermediate filaments to return to their 
usual length after being stretched might even 
enable such cells to recover their initial shape 
when tension is released. 

Latorre and colleagues’ work has revealed a 
more complex relationship between cell size and 
the forces that cells experience than was previ- 
ously appreciated. Future studies should attempt 
to unravel the mechanisms that enable cells to 
enter a state of superelasticity and to recover 
from high levels of deformation. Now that we 
know cell shape is not an appropriate proxy 
for assessing cellular tension, it will be crucial 
to develop ways to accurately monitor tension 
levels in tissues, so as to better understand the
factors that influence tissue shape.


Manuel Théry is at UMR 1160, Paris Diderot 
University, Saint-Louis Hospital, CEA, 
INSERM, AP-HP. 75010 Paris, France, and 

at UMR5168, University Grenoble-Alpes, 
BIG/LPCV, CEA, CNRS, INRA, Grenoble, 
France. Atef Asnacios is at UMR7057 CNRS, 
Paris Diderot University, MSC, 75205 Paris 
Cedex 13, France. 

e-mails: manuel.thery@cea.fr; 
atef.asnacios@univ-paris-diderot.fr 


1. Kipling, R. Just So Stories (Macmillan, 1902).
2. Latorre, E. et al. Nature 563, 203-208 (2018).
3. Trepat, X. et al. Nature Phys. 5, 426-430 (2009).
4. Fernandez, P., Pullarkat, P. A. & Ott, A. Biophys. J. 90, 3796-3805 (2006).
5. Gardel, M. L. et al. Proc. Natl. Acad. Sci. USA 103, 1762-1767 (2006).
6. Fabry, B. et al. Phys. Rev. Lett. 87, 1-4 (2001).


Immune-cell crosstalk 
in multiple sclerosis 


Interactions between the B and T cells of the human immune system are 
implicated in the brain disease multiple sclerosis. It emerges that B cells make a 
protein that is also made in the brain, and that T cells recognize this protein. 


RICHARD M. RANSOHOFF 


A hallmark of the disease multiple
sclerosis is an inflammatory autoimmune
attack¹ on the proteins of the
myelin sheath, a structure that wraps around 
the nerve fibres that project from neurons. The 
myelin sheath provides protection and nour- 
ishment to nerve fibres and enables efficient 
transmission of nerve impulses. Myelin-sheath 
injury causes a range of symptoms, depend- 
ing on the neurons that are affected. Which 
immune-system cells and protein targets have 
key roles in the initiation and progression of 
multiple sclerosis is not fully understood, and 
such information might aid the development 
of new treatments. Writing in Cell, Jelcic et al.²
present an analysis of immune-system cells 
found in people with multiple sclerosis that 
deepens our understanding of how immune 
cells might contribute to this disease. 

One factor linked to the risk of developing
multiple sclerosis is the possession of a 
particular version of a protein called HLA. 
HLA proteins enable cells to display anti- 
gens — fragments of proteins — on their 
surfaces. If the receptor for an antigen (the 
T-cell receptor; TCR) on a T cell recognizes
an antigen presented by an HLA protein, 
the T cell is activated to trigger an immune 
response against cells that express the antigen. 


Variations in the antigen-binding capacity of 
different HLA proteins and in the antigen- 
recognition capacity of TCRs enable the body 
to respond to a wide range of antigens asso- 
ciated with disease-causing microorganisms. 
However, there is a danger that if an HLA 
protein efficiently binds an antigen that is 
normally part of the body, and if a T cell that
recognizes the HLA-antigen complex is acti- 
vated, autoimmunity could develop. Such a 
mechanism might underlie the fact that the 
version of HLA called HLA-DR15 is a risk
factor for multiple sclerosis, and is estimated
to contribute 60% of the total genetic risk for 
developing the condition. 

T cells from people with multiple sclerosis 
are more prone to divide in vitro than are T cells 
from people without the condition. Such cell
division is reminiscent of the division that 
occurs as the result of normal immune-cell 
activation by an antigen stimulus, but in this 
case it does not seem to require the addition of 
an antigen stimulus to the sample of immune 
cells. This suggests either that the normal
requirement for antigen recognition is being 
bypassed, or that these T cells recognize an 
antigen that is present on other immune cells 
in the blood sample. Jelcic et al. investigated 
further, analysing in more detail the behaviour 
of immune cells in blood samples of people 
with multiple sclerosis. They convincingly 
demonstrate that both T cells and another
type of immune cell called a B cell from these 
samples could proliferate when grown in vitro. 
The authors term this type of division auto- 
proliferation, because it occurs spontaneously 
in vitro without the addition of an antigen. 

Jelcic et al. found that signalling through 
a TCR initiated T-cell proliferation, and that
cell proliferation was associated with the
production by T cells of a signalling protein
called IFN-γ (Fig. 1), which is associated
with multiple sclerosis. IFN-γ is a potent
activator of a category of immune cells called
macrophages, which directly damage the
myelin sheath in multiple sclerosis.

The authors implicate B-cell proliferation 
in driving T-cell autoproliferation, because 
neither T cells nor B cells divided if the cul- 
tured cells were exposed to a drug called 
ibrutinib. Ibrutinib inhibits the protein BTK, 
which is essential for signalling downstream 
of the B-cell antigen receptor that leads to 
B-cell proliferation. Interestingly, a phase IIb
clinical trial (see go.nature.com/2yhfphu) has 
reported preliminary evidence that the BTK 
inhibitor evobrutinib could potentially provide 
benefit for people with multiple sclerosis (see 
go.nature.com/2qtqby9). 

Each of the multiple-sclerosis treatments 
currently in use suppresses disease-asso- 
ciated brain inflammation, but in different 
ways. Jelcic et al. took advantage of this to 
test whether interactions between B cells and 
T cells are needed for T-cell autoprolifera- 
tion, and whether this phenomenon might be 
involved in processes that lead to the symp- 
toms of multiple sclerosis. 

The authors analysed blood samples from 
people with the disease who were receiving 
different anti-inflammatory treatments, and 
compared these results with control samples 
from people with the disease who were not 
receiving treatment. For those receiving an 
antibody called natalizumab, which causes 
an increase in the numbers of T cells and
immature B cells in the blood, in vitro analysis
showed that autoproliferation of B cells and 
T cells was increased compared with the 
controls. Samples from those receiving an 
antibody called rituximab, which depletes B 
cells from the bloodstream, had much-reduced 
T-cell proliferation compared with controls. 

This analysis of the effect of anti-inflam- 
matory treatments that affect T cells or B cells 
provides evidence consistent with the authors’ 
model that clinically relevant interactions 
between B cells and T cells occur in multi- 
ple sclerosis. For many years, it was generally 
thought that B cells do not have a role in mul- 
tiple sclerosis, because of results from animal 
studies. This view changed when striking benefits
were observed in clinical trials of B-cell
depletion for multiple-sclerosis treatment.

Jelcic and colleagues needed to answer the 
question of whether the autoproliferating 
T cells contribute to the development of mul- 
tiple sclerosis. To address this tough problem, 
the authors analysed the cellular descend- 
ants of individual proliferating T cells from 
the blood of people with multiple sclerosis. 
They looked at the variable portions of the 
TCRs present in the cells because these vari- 
able regions provide a unique pattern, akin 
to a barcode, that can identify any T cell and 
its genetically identical descendants — which 
form a cellular lineage termed a clone. 

In rigorous and challenging experiments 
using material from two people who had mul- 
tiple sclerosis, the authors analysed T cells 
found in brain tissue taken at biopsy or post- 
mortem and compared these with T cells 
from the same person's blood samples taken 
before brain-tissue samples were obtained. The 
authors found that T cells from blood samples 
that underwent autoproliferation in vitro 
belonged to an identical cellular lineage that 
matched T cells found in the brain-tissue sam- 
ples taken from the same person. 

This finding strongly suggests that some 
proliferating cells in the blood of people with 
multiple sclerosis could enter their brain. Once 
there, such T cells might release immune- 
signalling molecules called cytokines, such 
as macrophage-stimulating IFN-γ, that could
initiate inflammatory tissue injury. Immune 
cells are always present in the cerebrospinal 
fluid that bathes the brain and spinal cord, 
and IFN-γ-producing T cells and proliferating
T cells have previously been identified
in the cerebrospinal fluid of individuals
with multiple sclerosis. Jelcic and colleagues’ 
findings therefore highlight the significance of 
previous observations and provide additional 
support for established models of how this dis- 
ease proceeds. 

Figure 1 | Immune-cell action associated with multiple sclerosis. Jelcic et al.² report that B cells of the
immune system present in the bloodstream make a protein called RASGRP2. These cells use a protein
called HLA to present a peptide fragment (an antigen) of RASGRP2 on their cell surface. If this antigen
is recognized by the T-cell receptor (TCR) of another immune cell called a T cell, this interaction leads to
the proliferation of both the T cells and the B cells, a phenomenon that the authors call autoproliferation.
Their evidence indicates that these autoproliferating T cells can, by an unknown route, cross the blood–brain
barrier to enter the brain. RASGRP2 is also found in brain tissue. If neurons or other brain cells
express RASGRP2, this might trigger T cells that infiltrate the brain to orchestrate an autoimmune attack
by producing inflammatory mediators. For example, the production of IFN-γ proteins by activated
T cells could stimulate the macrophages of the immune system, which are known to attack the myelin-sheath
structure that protects nerve fibres and supports neuronal function. This in turn could lead to the
development of multiple sclerosis.


A final conundrum remains: what are
the antigenic target(s) of these T cells? This
is a key question because the relevant self-antigens
driving multiple sclerosis have not
been definitively identified. To try to answer
this, the authors analysed a clonal cell population
grown in vitro, derived from one T cell
and its descendants that were present in the
blood and the post-mortem brain of a person 
with multiple sclerosis. Jelcic and colleagues 
used an innovative approach that relied on 
computational evaluation of data obtained 
using standard methods to stimulate T cells. 
The authors evaluated an almost unimaginably 
large number of structurally similar but non- 
identical peptides for their capacity to act as 
an antigen that would generate a response 
from this T-cell population. They uncovered
an antigen from the protein RASGRP2 as one 
that probably stimulates the TCR of this T-cell 
population. RASGRP2 had not previously 
been linked to processes related to multiple 
sclerosis. The authors demonstrated that RASGRP2
is expressed both in B cells that elicit
T-cell proliferation and in brain tissue. 

Jelcic and colleagues’ study provides a 
model for how B-cell and T-cell interactions 
outside the brain might generate disease- 
contributing T cells that then enter the brain. 
Their discovery of an antigen associated with 
multiple sclerosis might, if other such antigens 
are identified in the future, reveal how auto- 
immunity occurs and, perhaps, how it could 
be remedied.


Gene expression variability across cells
and species shapes innate immunity 


Tzachi Hagai, Xi Chen, Ricardo J. Miragaia, Raghd Rostom, Tomas Gomes, Natalia Kunowska, Johan Henriksson,
Jong-Eun Park, Valentina Proserpio, Giacomo Donati, Lara Bossini-Castillo, Felipe A. Vieira Braga, Guy Naamati,
James Fletcher, Emily Stephenson, Peter Vegh, Gosia Trynka, Ivanela Kondova, Mike Dennis, Muzlifah Haniffa,
Armita Nourmohammad, Michael Lässig & Sarah A. Teichmann


As the first line of defence against pathogens, cells mount an innate immune response, which varies widely from cell 
to cell. The response must be potent but carefully controlled to avoid self-damage. How these constraints have shaped 
the evolution of innate immunity remains poorly understood. Here we characterize the innate immune response’s 
transcriptional divergence between species and variability in expression among cells. Using bulk and single-cell 
transcriptomics in fibroblasts and mononuclear phagocytes from different species, challenged with immune stimuli, 
we map the architecture of the innate immune response. Transcriptionally diverging genes, including those that encode 
cytokines and chemokines, vary across cells and have distinct promoter structures. Conversely, genes that are involved 
in the regulation of this response, such as those that encode transcription factors and kinases, are conserved between 
species and display low cell-to-cell variability in expression. We suggest that this expression pattern, which is observed 
across species and conditions, has evolved as a mechanism for fine-tuned regulation to achieve an effective but balanced
response.


The innate immune response is a cell-intrinsic defence programme that is
rapidly upregulated upon infection in most cell types. It acts to inhibit
pathogen replication while signalling the pathogen’s presence to other
cells. This programme involves the modulation of several cellular pathways,
including production of antiviral and inflammatory cytokines,
upregulation of genes that restrict pathogens, and induction of cell
death.

An important characteristic of the innate immune response is 
the rapid evolution that many of its genes have undergone along 
the vertebrate lineage. This rapid evolution is often attributed to
pathogen-driven selection.

Another hallmark of this response is its high level of heterogeneity 
among responding cells: there is extensive cell-to-cell variability in 
response to pathogen infection or to pathogen-associated molecular
patterns (PAMPs). The functional importance of this variability
is unclear. 

These two characteristics—rapid divergence in the course of 
evolution and high cell-to-cell variability—seem to be at odds with 
the strong regulatory constraints imposed on the host immune 
response: the need to execute a well-coordinated and carefully bal- 
anced programme to avoid tissue damage and pathological immune 
conditions. How this tight regulation is maintained despite rapid
evolutionary divergence and high cell-to-cell variability remains 
unclear, but it is central to our understanding of the innate immune 
response and its evolution. 

Here, we study the evolution of this programme using two cell
types—fibroblasts and mononuclear phagocytes—in different mam- 
malian clades challenged with several immune stimuli (Fig. 1a). 


Our main experimental system uses primary dermal fibroblasts, 
which are commonly used in immunological studies. We compare
the response of fibroblasts from primates (human and macaque) and
rodents (mouse and rat) to polyinosinic:polycytidylic acid (poly(I:C)),
a synthetic double-stranded RNA (dsRNA; Fig. 1a, left). Poly(I:C) is
frequently used to mimic viral infection as it rapidly elicits an antiviral
response.

We comprehensively characterize the transcriptional changes 
between species and among individual cells in their innate immune 
response. We use population (bulk) transcriptomics to investigate tran- 
scriptional divergence between species, and single-cell transcriptomics 
to estimate cell-to-cell variability in gene expression. Using promoter 
sequence analyses along with chromatin immunoprecipitation with 
sequencing (ChIP-seq), we study how changes in the expression of 
each gene between species and across cells relate to the architecture 
of its promoter. Furthermore, we examine the relationship between 
cross-species divergence in gene coding sequence and expression and 
constraints imposed by host-pathogen interactions. 

Additionally, we use a second system—bone marrow-derived mon- 
onuclear phagocytes from mouse, rat, rabbit and pig challenged with 
lipopolysaccharide (LPS), a commonly used PAMP of bacterial origin 
(Fig. 1a, right). 

Together, these two systems provide insights into the architecture of 
the immune response across species, cell types and immune challenges. 


Transcriptional divergence in immune response 
First, we studied the transcriptional response of fibroblasts to stimulation
with dsRNA (poly(I:C)) across the four species (human, macaque,
rat and mouse). We generated bulk RNA-sequencing (RNA-seq) data
for each species after 4 h of stimulation, along with respective controls
(see Fig. 1a and Methods).

Fig. 1 | Response divergence across species in innate immune response.
a, Study design. Left, primary dermal fibroblasts from mouse, rat, human
and macaque stimulated with dsRNA or controls. Samples were collected
for bulk and single-cell RNA-seq and ChIP-seq. Right, primary bone
marrow-derived mononuclear phagocytes from mouse, rat, rabbit and
pig stimulated with LPS or controls. Samples were collected for bulk and
single-cell RNA-seq. b, Left, fold-change (FC) in dsRNA stimulation
in fibroblasts for sample genes across species (edgeR exact test, based
on n = 6, 5, 3 and 3 individuals from human, macaque, rat and mouse,
respectively). Right, fold-change in LPS stimulation in phagocytes for
sample genes across species (Wald test implemented in DESeq2, based
on n = 3 individuals from each species). False discovery rate (FDR)-
corrected P values are shown (***P < 0.001, **P < 0.01, *P < 0.05). c, Top,
estimating each gene’s level of cross-species divergence in transcriptional
response to dsRNA stimulation in fibroblasts. Using differential expression
analysis, fold-change in dsRNA response was assessed for each gene
in each species. We identified 1,358 human genes as differentially
expressed (DE) (FDR-corrected q < 0.01), of which 955 had one-to-one
orthologues across the four studied species. For each gene with one-to-
one orthologues across all species, a response divergence measure was
estimated using: response divergence = $\log\left[\tfrac{1}{4} \times \sum_{i,j} \left(\log[\mathrm{FC\ primate}_i] - \log[\mathrm{FC\ rodent}_j]\right)^2\right]$.
Genes were grouped into low, medium and high
divergence according to their response divergence values for subsequent
analysis. Bottom, estimating each gene’s level of cross-species divergence in
LPS response in mononuclear phagocytes. A response divergence measure
was estimated using: response divergence = $\log\left[\tfrac{1}{3} \times \sum_{j} \left(\log[\mathrm{FC\ pig}] - \log[\mathrm{FC\ glire}_j]\right)^2\right]$
(where glires are mouse, rat and rabbit).

In all species, dsRNA treatment induced rapid upregulation of genes 
that encode expected antiviral and inflammatory products, including 
IFNB, TNF, IL1A and CCL5 (see also Supplementary Table 3). Focusing 
on one-to-one orthologues, we performed correlation analysis between 
species and observed a similar transcriptional response (Spearman 
correlation, P < 10⁻¹⁰ in all comparisons; Extended Data Fig. 1), as 
reported in other immune contexts!”"!°. Furthermore, the response 
tended to be more strongly correlated between closely related species 
than between more distantly related species, as in other expression 
programmes””-**, 

We characterized the differences in response to dsRNA between 
species for each gene, using these cross-species bulk transcriptomics 
data. While some genes, such as those encoding the NF-kB subunits 
RELB and NFKB2, respond similarly across species, other genes 
respond differently in the primate and rodent clades (Fig. 1b, left). For 
example, Ifi27 (which encodes a restriction factor against numerous 
viruses) is strongly upregulated in primates but not in rodents, whereas 
Daxx (which encodes an antiviral transcriptional repressor) exhibits 
the opposite behaviour. 

Similarly, in our second experimental system, which consists of 
lipopolysaccharide (LPS)-stimulated mononuclear phagocytes from 
mouse, rat, rabbit, and pig (Fig. 1b, right), some genes responded similarly 
across species (for example, Nfkb2), whereas others were highly 
upregulated only in specific clades (for example, Phlda1). 

To quantify transcriptional divergence in immune responses between 
species, we focused on genes that were differentially expressed during 
the stimulation (see Methods). For simplicity, we refer to these genes 
as ‘responsive genes’ (Fig. 1c). In this analysis, we study the subset of 
these genes with one-to-one orthologues across the studied species. 
There are 955 such responsive genes in dsRNA-stimulated human 
fibroblasts and 2,336 in LPS-stimulated mouse phagocytes. We define a 
measure of response divergence by calculating the differences between 
the fold-change estimates while taking the phylogenetic relationship 
into account (Methods, Supplementary Figs. 1-7 and Supplementary 
Table 4). 
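
As a rough illustration of the response divergence measure given in the Fig. 1c legend, the sketch below averages the squared differences between log fold-changes over all cross-clade species pairs and then takes the logarithm. The function name, the use of the natural logarithm and the handling of a zero mean are assumptions made for this example; the full procedure is described in the Methods.

```python
import math
from itertools import product


def response_divergence(clade_a_log_fc, clade_b_log_fc):
    """Log of the mean squared difference between the log fold-changes
    of two clades, taken over every cross-clade pair of species.

    With two primates and two rodents there are four pairs (the 1/4 factor);
    with pig versus three glires there are three pairs (the 1/3 factor).
    The logarithm base is an assumption (natural log is used here).
    """
    pairs = list(product(clade_a_log_fc, clade_b_log_fc))
    mean_sq = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
    if mean_sq == 0:
        # Identical responses in both clades: divergence is unbounded below.
        return float("-inf")
    return math.log(mean_sq)


# Hypothetical log fold-changes, for illustration only.
fibroblast_gene = response_divergence([2.0, 2.0], [0.0, 0.0])
phagocyte_gene = response_divergence([1.0], [0.0, 0.0, 0.0])
```

The same function covers both experimental systems because the normalizing factor is simply the number of cross-clade pairs.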

For subsequent analyses, we split the 955 genes that were responsive 
in fibroblasts into three groups on the basis of their level of response 
divergence: (1) high-divergence dsRNA-responsive genes (the top 
25% of genes with the highest divergence values in response to dsRNA 
across the four studied species); (2) low-divergence dsRNA-responsive 
genes (the bottom 25%); and (3) genes with medium divergence across 
species (the middle 50%; Fig. 1c). We performed an analogous proce- 
dure for the 2,336 LPS-responsive genes in phagocytes. 
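
The grouping into high-, medium- and low-divergence genes can be sketched as a rank-based split. This is a minimal example, assuming divergence values are held in a dictionary keyed by gene and that ties are broken by sort order; the gene names and the function name are hypothetical.

```python
def classify_by_divergence(divergence):
    """Assign each gene to 'low' (bottom 25%), 'high' (top 25%)
    or 'medium' (middle 50%) by rank of its response divergence."""
    ranked = sorted(divergence, key=divergence.get)
    n = len(ranked)
    low_cut = n // 4          # first index that is no longer 'low'
    high_cut = n - n // 4     # first index that is 'high'
    groups = {}
    for idx, gene in enumerate(ranked):
        if idx < low_cut:
            groups[gene] = "low"
        elif idx >= high_cut:
            groups[gene] = "high"
        else:
            groups[gene] = "medium"
    return groups


# Hypothetical divergence values for eight genes.
example = {f"g{i}": float(i) for i in range(8)}
groups = classify_by_divergence(example)
```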


Promoter architecture of diverging genes 

Next, we tested whether divergence in transcriptional responses is 
reflected in the conservation of promoter function and sequence. 
Using ChIP-seq, we profiled active histone marks in the fibroblasts of 
all species. The presence of trimethylation of lysine 4 on histone H3 
(H3K4me3) in promoter regions of high-divergence genes was sig- 
nificantly less conserved between humans and rodents than was the 
presence of H3K4me3 in promoters of low-divergence genes (Extended 
Data Fig. 2). 

We then used the human H3K4me3 ChIP-seq peaks to define active 
promoter regions of the responsive genes in human fibroblasts. The 
density of transcription factor binding motifs (TFBMs) was signifi- 
cantly higher in the active promoter regions of high-divergence genes 
than in low-divergence genes (Fig. 2a). Notably, when comparing 
the conservation of the core promoter regions in high- versus low- 
divergence dsRNA-responsive genes, we found that genes that diverge 
highly in response to dsRNA show higher sequence conservation in 
this region (Fig. 2b). 

This unexpected discordance may be related to the fact that 
promoters of high- and low-divergence genes have distinctive archi- 
tectures, associated with different constraints on promoter sequence 
evolution18,25,26. Notably, promoters containing TATA-box elements 
tend to have most of their regulatory elements in regions immediately 
upstream of the transcription start site (TSS). These promoters are thus 
expected to be more conserved. The opposite is true for CpG island 
(CGI)26,27 promoters. Indeed, we found that TATA-boxes are associated 
with higher transcriptional divergence, while genes with CGIs diverge 
more slowly, both in fibroblasts and phagocytes (Fig. 2c; Extended 
Data Fig. 3). Thus, a promoter architecture enriched in TATA-boxes 
and depleted of CGIs is associated with higher transcriptional diver- 
gence, while entailing higher sequence conservation upstream of these 
genes18,26,27. 


Transcriptional divergence of cytokines 

We next investigated whether different functional classes among 
responsive genes are characterized by varying levels of transcriptional 
divergence. To this end, we divided responsive genes into categories 
according to function (such as cytokines, transcription factors and 
kinases) or the processes in which they are known to be involved (such 
as apoptosis or inflammation). 
Fig. 2 | Transcriptionally divergent genes have unique functions 

and promoter architectures. a, TFBM density in active promoters 

and response divergence. For each gene studied in fibroblast dsRNA 
stimulation, the total number of TFBM matches in its H3K4me3 histone 
mark was divided by the length of the mark (human marks were used; 

n= 879 differentially expressed genes with ChIP-seq data). High- 
divergence genes have higher TFBM density than low-divergence genes 
(one-sided Mann-Whitney test). b, Promoter sequence conservation 

and response divergence in fibroblast dsRNA stimulation. Sequence 
conservation values are estimated with phyloP7 for 500 base pairs 
upstream of the transcription start site (TSS) of the human gene. Mean 
conservation values of each of the 500 base pairs upstream of the TSS are 
shown for high-, medium- and low-divergence genes (n = 840 genes). 
Genes that are highly divergent have higher sequence conservation 
(one-sided Kolmogorov-Smirnov test). The 95% confidence interval 

for predictions from a linear model computed by geom_loess function 

is shown in grey. c, Comparison of divergence in response of genes with 
and without a TATA-box and a CGI in fibroblast dsRNA stimulation and 
phagocyte LPS stimulation. TATA-box matches and CGI overlaps were 
computed with respect to the TSS of human genes in fibroblasts (n = 955 
genes), and to the TSS of mouse genes in phagocytes (n = 2,336). 

d, Distributions of divergence values of 9,753 expressed genes in fibroblasts, 
955 dsRNA-responsive genes and different functional subsets of the dsRNA- 
responsive genes (each subset is compared with the set of 955 genes using a 
one-sided Mann-Whitney test and FDR-corrected P values are shown). 

e, Distributions of divergence values of 6,619 expressed genes in phagocytes, 
2,336 LPS-responsive genes and different functional subsets of the LPS- 
responsive genes (each subset is compared with the set of 2,336 genes using 
a one-sided Mann-Whitney test and FDR-corrected P values are shown). 
Violin plots show the kernel probability density of the data. Boxplots 
represent the median, first quartile and third quartile with lines extending to 
the furthest value within 1.5 × the interquartile range (IQR). 


Genes related to cellular defence and inflammation—most notably 
cytokines, chemokines and their receptors (hereafter ‘cytokines’)— 
tended to diverge in response significantly faster than genes involved 
in apoptosis or immune regulation (chromatin modulators, transcrip- 
tion factors, kinases and ligases) (Fig. 2d, e, Extended Data Fig. 4, 
Supplementary Fig. 1). 

Cytokines also had a higher transcriptional range in response to 
immune challenge (a higher fold-change). Regressing the fold-change 
from the divergence estimates resulted in reduction of the relative 
divergence of cytokines versus other responsive genes, but the differ- 
ence still remained (Supplementary Fig. 2). Cytokine promoters are 
enriched in TATA-boxes (17% versus 2.5%, P= 1.1 x 107%, Fisher’s 
exact test) and depleted of CGIs (14% versus 69%, P= 1.6 x 10~°), sug- 
gesting that this promoter architecture is associated both with greater 
differences between species (response divergence) and larger changes 
between conditions (transcriptional range). 
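
The enrichment of a promoter element in one gene class relative to another can be tested on a 2 x 2 contingency table with Fisher's exact test. The sketch below is a one-sided version written from the hypergeometric distribution; whether the reported P values are one- or two-sided is not stated here, and the counts used are hypothetical, not taken from the study.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    a: class genes with the element,  b: class genes without it,
    c: other genes with the element,  d: other genes without it.
    Returns P(X >= a) under the hypergeometric null.
    """
    row1 = a + b              # size of the gene class
    col1 = a + c              # genes carrying the element
    n = a + b + c + d
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


# Hypothetical counts, for illustration only.
p_value = fisher_enrichment_p(3, 1, 1, 3)
```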


Cell-to-cell variability in immune response 

Previous studies have shown that the innate immune response displays 
high variability across responding cells**”°. However, the relationship 
between cell-to-cell transcriptional variability and response divergence 
between species is not well understood. 

To study heterogeneity in gene expression across individual cells, we 
performed single-cell RNA-seq in all species in a time course following 
immune stimulation. We estimated cell-to-cell variability quantitatively 
using an established measure for variability: distance to median (DM). 
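
To make the idea of a distance-to-median measure concrete, the sketch below computes, for each gene, the squared coefficient of variation across cells and subtracts a running median taken over genes of similar mean expression. This is a simplified illustration under stated assumptions (log10 scale, a small fixed window, no further corrections); the window size and function name are hypothetical, and the published DM procedure may differ in its details.

```python
import math
from statistics import mean, median, pvariance


def distance_to_median(expr, window=3):
    """expr maps gene -> expression values across cells.
    Returns gene -> log10(CV^2) minus the running median of log10(CV^2)
    among genes with neighbouring mean expression."""
    stats = []
    for gene, values in expr.items():
        mu = mean(values)
        var = pvariance(values)
        if mu <= 0 or var <= 0:
            continue  # log-scale statistics are undefined for these genes
        stats.append((math.log10(mu), math.log10(var / mu ** 2), gene))
    stats.sort()  # order genes by mean expression
    half = window // 2
    dm = {}
    for i, (_, log_cv2, gene) in enumerate(stats):
        lo, hi = max(0, i - half), min(len(stats), i + half + 1)
        local_median = median(s[1] for s in stats[lo:hi])
        dm[gene] = log_cv2 - local_median
    return dm


# Three hypothetical genes with equal mean but different spread.
dm = distance_to_median({"a": [1, 3], "b": [0, 4], "c": [1.5, 2.5]})
```

Subtracting the running median removes the dependence of variability on expression level, so that genes can be compared across the expression range.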

We found a clear trend in which genes that were highly divergent 
in response between species were also more variable in expression 
across individual cells within a species (Fig. 3a). The relationship 
between rapid divergence and high cell-to-cell variability held true 
in both the 955 dsRNA-responsive genes in fibroblasts and the 2,336 
LPS-responsive genes in phagocytes. This can be observed across the 
stimulation time points and in different species (Extended Data Figs. 5, 6). 
We analysed in depth the relationship between transcriptional diver- 
gence and cell-to-cell variability by using additional immune stimula- 
tion protocols (Supplementary Figs. 8, 9), and different experimental 
and computational approaches (Extended Data Fig. 7, Supplementary 
Figs. 10-13). Notably, the trends we observed are not a result of 
technical biases due to low expression levels in either the bulk or the 
single-cell RNA-seq data (Supplementary Figs. 14, 15). 

Next, we examined the relationship between the presence of 
promoter elements (CGIs and TATA-boxes) and a gene’s cell-to-cell var- 
iability. Genes that are predicted to have a TATA-box in their promoter 
had higher transcriptional variability, whereas CGI-containing genes 
tended to have lower variability (Fig. 3b), in agreement with previous 
findings*!. Thus, both transcriptional variability between cells (Fig. 3b) 
and transcriptional divergence between species (Fig. 2c) are associated 
with the presence of specific promoter elements. 


Transcriptional variability of cytokines 
We subsequently compared the response divergence across species with 
the transcriptional cell-to-cell variability of three groups of responsive 
genes with different functions: cytokines, transcription factors, and 
kinases and phosphatases (hereafter ‘kinases’; Fig. 3c, Extended Data 
Fig. 8). In contrast to kinases and transcription factors, many cytokines 
display relatively high levels of cell-to-cell variability (Extended Data 
Fig. 9), being expressed only in a small subset of responding cells 
(Extended Data Fig. 10). This has previously been reported for several 
cytokines”®. For example, IFNB is expressed in only a small fraction 
of cells infected with viruses or challenged with various stimuli®!!*?. 
Here, we find that cells show high levels of variability in expression of 
cytokines from several families (for example, IFNB, CXCL10 and CCL2). 
Cell-to-cell variability of cytokines remains relatively high in 
comparison to kinases and transcription factors during a time course 
of 2, 4 and 8 h after dsRNA stimulation of fibroblasts (Extended Data 
Fig. 9). This pattern is similar across species, and can also be observed 
in LPS-stimulated phagocytes (Extended Data Fig. 9). Thus, the high 
variability of cytokines and their expression in a small fraction of 
stimulated cells across all time points is evolutionarily conserved. 
Cytokines tended to be co-expressed in the same cells, raising the 
possibility that their expression is coordinated (see Supplementary 
Information and Supplementary Fig. 16). We also identified genes 
whose expression was correlated with cytokines in human fibro- 
blasts and showed that their orthologues tend to be co-expressed with 
cytokines in other species. This set is enriched with genes known to be 
involved in cytokine regulation (Supplementary Table 5). 
Fig. 3 | Cell-to-cell variability in immune response corresponds to 
response divergence. a, Comparison of divergence in response across 
species with transcriptional variability between individual cells. Top, 
fibroblast dsRNA stimulation (variability measured in n = 55 human 
cells, following 4h dsRNA stimulation). Bottom, phagocyte LPS 
stimulation (variability measured in n = 3,293 mouse cells, following 4 h 
LPS stimulation). Genes classified as high-, medium- or low-divergence 
according to level of response divergence. Cell-to-cell variability values of 
high-divergence genes were compared with those of low-divergence genes 
(one-sided Mann-Whitney test). b, Comparison of cell-to-cell variability 
of genes with and without a TATA-box and a CGI, in fibroblast dsRNA 
stimulation and phagocyte LPS stimulation (one-sided Mann-Whitney 
test). Cell-to-cell variability values are from DM estimations of human 
fibroblasts stimulated with dsRNA for 4 h (n =55 cells) and from mouse 
phagocytes stimulated with LPS for 4 h (n = 3,293 cells). c, Scatter plot 
showing divergence in response to dsRNA in fibroblasts across species and 
transcriptional cell-to-cell variability in human cells following 4 h of dsRNA 
stimulation (n= 684 dsRNA-responsive genes). Purple, cytokines; green, 
transcription factors; beige, kinases. The distributions of divergence and 
variability values of these groups are shown above and to the right of the 
scatter plot, respectively. d, A network showing genes that correlate positively 
in expression with the chemokine gene CXCL10 across cells (Spearman 
correlation, ρ > 0.3), in at least two species (one of which is human), 
following dsRNA treatment in fibroblasts (based on n= 146, 74, 175 and 170 
human, macaque, rat and mouse cells, respectively). Purple, cytokines; red, 
positive regulators of cytokine expression; blue, negative regulators. Colours 
of lines, from light to dark grey, reflect the number of species in which this 
pair of genes was correlated. Boxplots represent the median, first quartile and 
third quartile with lines extending to the furthest value within 1.5 x IQR. 


As an example, we focused on the genes whose expression is posi- 
tively correlated with the chemokine CXCL10 in at least two species 
(Fig. 3d). This set includes four cytokines co-expressed with CXCL10 
(in purple), as well as known positive regulators of the innate immune 
response and cytokine production (in blue), such as the viral sensors 
IFIH1 (also known as MDA5) and RIG-I (also known as DDX58). 
This is in agreement with previous evidence that IFNB expression is 
limited to cells in which important upstream regulators are expressed 
at sufficiently high levels*'!>?. Here, we show that this phenomenon 
of co-expression with upstream regulators applies to a wider set of 
cytokines and is conserved across species. Notably, cytokines were 
co-expressed not only with their positive regulators but also with genes 
that are known to act as negative regulators of cytokine expression or 
cytokine signalling (in red), suggesting that cytokine expression and 
function is tightly controlled at the level of individual cells. 


The evolutionary landscape of innate immunity 

Many immune genes, including several cytokines and their receptors, 
have been shown to evolve rapidly in coding sequence***. However, 
it is not known how divergence in coding sequence relates to tran- 
scriptional divergence in innate immune genes. Using the set of 955 
dsRNA-responsive genes in fibroblasts, we assessed coding sequence 
evolution in the three subsets of low-, medium- and high-divergence 
genes (as defined in Fig. 1c). 

We compared the rate at which genes evolved in their coding 
sequences with their response divergence by considering the ratio of 
non-synonymous (dN) to synonymous (dS) nucleotide substitutions. 
Genes that evolved rapidly in transcriptional response had higher 
coding sequence divergence (higher dN/dS values) than dsRNA- 
responsive genes with low response divergence (Fig. 4a). 

Rapid gene duplication and gene loss have been observed in several 
important immune genes***? and are thought to be a result of pathogen- 
driven pressure*™!, We therefore tested the relationship between a 
gene’s divergence in response and the rate at which the gene’s family 
has expanded and contracted in the course of vertebrate evolution. We 
found that transcriptionally divergent dsRNA-responsive genes have 
higher rates of gene gain and loss (Fig. 4b) and consequently are also 
evolutionarily younger (Fig. 4c, Supplementary Fig. 17). 

Previous reports have suggested that proteins encoded by younger 
genes tend to have fewer protein-protein interactions (PPIs) within 
cells”. Indeed, we found that rapidly diverging genes tend to have fewer 
PPIs (Fig. 4d). Together, these results suggest that transcriptionally diver- 
gent dsRNA-responsive genes evolve rapidly through various mecha- 
nisms, including fast coding sequence evolution and higher rates of gene 
loss and duplication events, and that their products have fewer inter- 
actions with other cellular proteins than those of less divergent genes. 

The interaction between pathogens and the host immune system 
is thought to be an important driving force in the evolution of both 
sides. We therefore investigated the relationship between transcrip- 
tional divergence and interactions with viral proteins by compiling a 
data set of known host-virus interactions in humans®***, Notably, 
genes whose products had no known viral interactions showed higher 
response divergence than genes encoding proteins with viral interac- 
tions (Fig. 4e). Furthermore, the transcriptional divergence of genes 
targeted by viral immunomodulators*°—viral proteins that subvert 
the host immune system—was lower still (Fig. 4e). These observations 
suggest that viruses have evolved to modulate the immune system 
by interacting with immune proteins that are relatively conserved in 
their response. Presumably, these genes cannot evolve away from viral 
interactions, unlike host genes that are less constrained*°. 

The summary of our results in Fig. 4f highlights the differences in 
both regulatory and evolutionary characteristics between cytokines 
and other representative dsRNA-responsive genes. Cytokines evolve 
rapidly through various evolutionary mechanisms and have higher 
transcriptional variability across cells. By contrast, genes that are 
involved in immune response regulation, such as transcription factors 
and kinases, are more conserved and less heterogeneous across cells. 
These genes encode proteins that have more interactions with other 
cellular proteins, suggesting that higher constraints are imposed on 
their evolution. This group of conserved genes is more often targeted 
by viruses, revealing a relationship between host-pathogen dynamics 
and the evolutionary landscape of the innate immune response. 
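
The Fig. 4f legend describes values shown on a normalized scale between 0 and 100, with the highest value scored as 100 and missing values left blank. One way to produce such a scale is min-max normalization, sketched below; whether the lowest value is mapped to 0, as assumed here, is not specified in the legend, and the function name is hypothetical.

```python
def scale_0_100(values):
    """Min-max scale a list to [0, 100]; None marks a missing value
    and is passed through unchanged."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)
    lo, hi = min(present), max(present)
    if hi == lo:
        # All observed values are equal: give each the top score.
        return [None if v is None else 100.0 for v in values]
    return [
        None if v is None else 100.0 * (v - lo) / (hi - lo)
        for v in values
    ]


# Hypothetical measurements for four genes, one of them missing.
scaled = scale_0_100([1, None, 3, 5])
```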


Discussion 
Here, we have charted the evolutionary architecture of the innate
immune response. We show that genes that diverge rapidly between
species show higher levels of variability in their expression across
individual cells than genes that diverge more slowly. Both of these
characteristics are associated with a similar promoter architecture, enriched
in TATA-boxes and depleted of CGIs. Notably, such promoter
architecture is also associated with the high transcriptional range of genes
during the immune response. Thus, transcriptional changes between
conditions (stimulated versus unstimulated), species (transcriptional
divergence), and individual cells (cell-to-cell variability) may
all be mechanistically related to the same promoter characteristics. In
yeast, TATA-boxes are enriched in promoters of stress-related genes,
displaying rapid transcriptional divergence between species and high
variability in expression*”*”. This finding suggests intriguing analogies
between the mammalian immune and yeast stress responses, two
systems that have been exposed to continuous changes in external
stimuli during evolution.

Fig. 4 | Relationship of response divergence and other evolutionary
modes. a–d, dsRNA-responsive genes in fibroblasts are divided by level of
response divergence into three groups, as in Fig. 1c. a, Coding sequence
divergence, as measured using dN/dS values across 29 mammals. Higher
dN/dS values denote faster coding sequence evolution (n = 567 genes).
b, Rate at which genes were gained and lost within the gene family across
the vertebrate clade (plotted as −log P). Higher values denote faster gene
gain and loss rate (n = 955 genes). c, Evolutionary age (estimated with
Panther7 phylogeny and Wagner reconstruction algorithm). Values denote
the branch number with respect to human (distance from human in the
phylogenetic tree); higher values indicate greater age (n = 931 genes).
d, Number of known physical interactions with other cellular proteins
(n = 955 genes). e, Distribution of transcriptional response divergence
values among dsRNA-responsive genes whose protein products do
not interact with viral proteins, interact with at least one viral protein,
or interact with viral immunomodulators (n = 648, 307 and 25 genes,
respectively). a–e, One-sided Mann-Whitney tests. f, A scaled heat map
showing values of response divergence (as in Fig. 1c), cell-to-cell
variability (as in Fig. 3a), coding sequence divergence (dN/dS values, as
in a), gene age (as in c; younger genes have darker colours), number of
cellular PPIs (as in d) and number of host-virus interactions (as in e),
for example genes from three functional groups: cytokines, transcription
factors, and kinases. Values are shown in a normalized scale between 0
and 100, with the gene with the highest value assigned a score of 100.
Missing values are shown in white. Boxplots represent the median, first
quartile and third quartile with lines extending to the furthest value within
1.5 x IQR. Violin plots show the kernel probability density of the data.

We have also shown that genes involved in regulation of the immune 
response—such as transcription factors and kinases—are relatively con- 
served in their transcriptional responses. These genes might be under 
stronger functional and regulatory constraints, owing to their roles 
in multiple contexts and pathways, which would limit their ability to 
evolve. This limitation could represent an Achilles’ heel that is used by 
pathogens to subvert the immune system. Indeed, we found that viruses 
interact preferentially with conserved proteins of the innate immune 
response. Cytokines, on the other hand, diverge rapidly between 
species, owing to their promoter architecture and because they have 
fewer constraints imposed by intracellular interactions or additional 
non-immune functions. We therefore suggest that cytokines represent 
a successful host strategy to counteract rapidly evolving pathogens as 
part of the host-pathogen evolutionary arms race. 

Cytokines also display high cell-to-cell variability and tend to 
be co-expressed with other cytokines and cytokine regulators in a 
small subset of cells, and this pattern is conserved across species. As 
prolonged or increased cytokine expression can result in tissue 
damage**-°°, restriction of cytokine production to only a few cells 
may enable a rapid, but controlled, response across the tissue to avoid 
long-lasting and potentially damaging effects. 


Online content 

Any methods, additional references, Nature Research reporting summaries, source 
data, statements of data availability and associated accession codes are available at 
https://doi.org/10.1038/s41586-018-0657-2. 
METHODS 


Ethical compliance. This project was approved by the Wellcome Sanger Institute 
Animal Welfare and Ethical Review Body, and complied with all relevant 
ethical regulations regarding animal research and human studies. Human cells 
were obtained from the Hipsci project*!, where they were collected from volunteers 
recruited from the NIHR Cambridge BioResource (written consent was given). 
Human skin profiling was performed in accordance with protocols approved 
by the Newcastle Research Ethics Committee (REC approval 08/H0906/95+5). 
Macaque skin samples were obtained from animals assigned to unrelated non- 
infectious studies, provided by Public Health England’s National Infection Service 
in accordance with Home Office (UK) guidelines and approved by the Public 
Health England Ethical Review Committee under an appropriate UK Home Office 
project license. 

Cross-species dermal fibroblast stimulation with dsRNA and IFNB. Tissue 
culture. We cultured primary dermal fibroblasts from low passage cells (below 
10) that originated from females from four different species (human (European 
ancestry), rhesus macaque, C57BL/6 (black 6) mouse and brown Norway rat). 
All skin samples were taken from shoulders. Stimulation experiments and library 
preparations were done in identical conditions across all species and for all genom- 
ics techniques. Details on the numbers of individuals used in each technique are 
listed in each technique’s section and in Supplementary Table 1. 

Human cells were obtained from the Hipsci project! (http://www.hipsci.org/). 
Rhesus macaque cells were extracted from skin tissues that were incubated for 2h 
with 0.5% collagenase B (Roche; 11088815001) after mechanical processing, and 
then filtered through 100-µm strainers before being plated and passaged before 
cryo-banking. Rodent cells were obtained from PeloBiotech where they were 
extracted using a similar protocol. In vitro cultured fibroblasts from all four spe- 
cies resemble a particular in vivo cluster of dermal fibroblasts (see Supplementary 
Information). Cells were not tested for mycoplasma contamination. 

Prior to stimulation, cells were thawed and grown for several days in ATCC 
fibroblast growth medium (Fibroblast Basal Medium (ATCC, ATCC-PCS-201-030) 
with Fibroblast Growth Kit-Low serum (ATCC, PCS-201-041), supplemented with 
Primocin (Invivogen, ant-pm-1) and penicillin/streptomycin (Life Technologies, 
15140122)), a controlled medium that has proven to provide good growing 
conditions for fibroblasts from all species, with doubling times of slightly less than 24 h. 
About 18 h before stimulation, cells were trypsinized, counted and seeded into 
6-well plates (100,000 cells per well). Cells were stimulated as follows: (1) stimulated 
with 1 µg/ml high-molecular mass poly(I:C) (Invivogen, tlrl-pic) transfected 
with 2 µg/ml Lipofectamine 2000 (ThermoFisher, 11668027); (2) mock transfected 
with Lipofectamine 2000; (3) stimulated with 1,000 IU of IFNB for 8 h (human 
IFNB: 11410-2 (for human and macaque cells); rat IFNB: 13400-1; mouse IFNB: 
12401-1; all IFNs were obtained from PBL, and had activity units based on similar 
virological assays); or (4) left untreated. Interferon stimulation was used as a con- 
trol, to study how genes that were upregulated in the secondary wave of the innate 
immune response diverge between species. 

Additional human and mouse samples were stimulated with 1,000 IU of 
cross-mammalian IFN (CMI, or Universal Type I IFN Alpha, PBL, 11200-1). 
The latter stimulation was done to assess the effects of species-specific and batch- 
specific IFNB. 

In all of the above-mentioned stimulations, we used a longer time course for 
single-cell RNA-seq than for bulk RNA-seq, for two main reasons: (1) in the bulk, 
we chose to focus on one main stimulation time point for simplicity and to obtain 
an intuitive fold-change between stimulated and unstimulated conditions; (2) in 
single cells, when studying cell-to-cell variability, we chose to profile, in addition 
to the main stimulation time point, cells in earlier and later stages of the response. 
This is important for studying how the dynamics and magnitude of the response 
affect gene expression variability between responding cells. 

The poly(I:C) we used tested negative for the presence of bacterial beta-endotoxin 
using a coagulation test (PYROGENT Plus, 0.06 EU/ml sensitivity, N283-06). 
Bulk RNA-seq: library preparation and sequencing. For bulk transcriptomics 
analysis, cells from individuals from different species were grown in parallel and 
stimulated with dsRNA, IFNB (and cross-mammalian IFN) and their respective 
controls. In total, samples from 6 humans, 6 macaques, 3 mice and 3 rats were used. 
Total RNA was extracted using the RNeasy Plus Mini kit (Qiagen, 74136), using 
QIAcube (Qiagen). RNA was then measured using a Bioanalyzer 2100 (Agilent 
Technologies), and samples with RIN < 9 were excluded from further analysis (one 
macaque sample stimulated with poly(I:C) and its control). 

Libraries were produced using the Kapa Stranded mRNA-seq Kit (Kapa 
Biosystems, KK8421). The Kapa library construction protocol was modified 
for automated library preparation by Bravo (Agilent Technologies). cDNA was 
amplified in 13 PCR cycles, and purified using Ampure XP beads (Beckman 
Coulter, A63882) (1.8 volume) using Zephyr (Perkin Elmer). Pooled samples 
were sequenced on an Illumina HiSeq 2500 instrument, using paired-end 
125-bp reads. 
ChIP-seq: library preparation and sequencing. Samples from three individuals from 
each of the four species were grown and stimulated (with poly(I:C) for 4 h or left 
untreated, as described above) in parallel to samples collected for bulk RNA-seq. 
Following stimulation, samples were crosslinked in 1% HCHO (prepared in 
1x DPBS) at room temperature for 10 min, and HCHO was quenched by the 
addition of glycine at a final concentration of 0.125 M. Cells were pelleted at 
4°C at 2,000g, washed with ice-cold 1x DPBS twice, and snap-frozen in liquid 
nitrogen. Cell pellets were stored at -80°C until further stages were performed. 
ChIPmentation was performed according to version 1.0 of the published proto- 
col” with a few modifications (see additional details in Supplementary Methods). 

Library preparation reactions contained the following reagents: 10 µl purified 
DNA (from the above procedure), 2.5 µl PCR Primer Cocktails (Nextera kit, 
Illumina, FC-121-1030), 2.5 µl N5xx (Nextera index kit, Illumina, FC-121-1012), 
2.5 µl N7xx (Nextera index kit, Illumina, FC-121-1012), 7.5 µl NPM PCR Master 
Mix (Nextera kit, Illumina, FC-121-1030). PCR cycles were as follows: 72°C, 5 min; 
98°C, 2 min; [98°C, 10 s, 63°C, 30 s, 72°C, 20 s] x 12; 10°C hold. 
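
As a quick check on the thermocycler programme above, the sketch below encodes the listed steps and sums their hold times. It counts only the programmed holds; ramp times between temperatures and the final 10°C hold are ignored, and the data layout is an assumption made for this example.

```python
# Each entry: (label, list of (temperature in °C, seconds), number of repeats)
PROGRAM = [
    ("gap filling", [(72, 5 * 60)], 1),
    ("initial denaturation", [(98, 2 * 60)], 1),
    ("amplification", [(98, 10), (63, 30), (72, 20)], 12),
]


def total_seconds(program):
    """Sum of programmed hold times, excluding ramping and the final hold."""
    return sum(
        repeats * sum(seconds for _, seconds in steps)
        for _, steps, repeats in program
    )


run_time = total_seconds(PROGRAM)
run_minutes = run_time / 60
```

Under these assumptions the programmed holds add up to 300 s + 120 s + 12 x 60 s.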

Amplified libraries were purified by double AMPure XP bead purification: first
at a 0.5x bead ratio (keeping the supernatant), then at a 1.4x bead ratio (keeping
the bound DNA). DNA was eluted in 20 µl Buffer EB (QIAGEN).

One microlitre of library was run on a Bioanalyzer (Agilent Technologies) to
verify normal size distribution. Pooled samples were sequenced on an Illumina
HiSeq 2000 instrument, using paired-end 75-bp reads.
Flow cytometry for single-cell RNA-seq. For scRNA-seq, we performed two biological
replicates, with each replicate having one individual from each of the four studied
species. A time course of dsRNA stimulation of 0, 4 and 8 h was used in one
replicate (divided into two technical replicates), while the second replicate included
a time course of 0, 2, 4 and 8 h. Poly(I:C) transfection was done as described above.
In the case of sorting with INFLUX, we used rhodamine-labelled poly(I:C).

Cells were sorted with either Beckman Coulter MoFlo XDP (first replicate)
or Becton Dickinson INFLUX (second replicate) into wells containing 2 µl lysis
buffer (1:20 solution of RNase Inhibitor (Clontech, 2313A) in 0.2% v/v Triton
X-100 (Sigma-Aldrich, T9284)), spun down and immediately frozen at -80°C.

When sorting with MoFlo, a pressure of 15 psi was used with a 150-µm nozzle,
using the ‘Single’ sort purity mode. Dead or late-apoptosis cells were excluded
using propidium iodide at 1 µg/ml (Sigma, Cat Number P4170) and single
cells were selected using FSC W versus FSC H. When sorting with INFLUX, a
pressure of 3 psi was used with a 200-µm nozzle, with the ‘single’ sort mode. Dead
or late-apoptosis cells were excluded using 100 ng/ml DAPI
(4′,6-diamidino-2-phenylindole) (Sigma, D9542). DAPI was detected using the
355-nm laser (50 mW), using a 460/50 nm bandpass filter. Rhodamine was detected
using the 561-nm laser (50 mW), using a 585/29 nm bandpass filter. Single cells
were collected using FSC W versus FSC H.

Library preparation from full-length RNA from single cells and sequencing. 
Sorted plates were processed according to the Smart-seq2 protocol®’. Oligo-dT 
primer (IDT), dNTPs (ThermoFisher, 10319879) and ERCC RNA Spike-In Mix 
(1:25,000,000 final dilution, Ambion, 4456740) were added to each well, and 
reverse transcription (using 50 U SmartScribe, Clontech, 639538) and PCR were 
performed following the original protocol with 25 PCR cycles. cDNA libraries were 
prepared using Nextera XT DNA Sample Preparation Kit (Illumina, FC-131-1096), 
according to the protocol supplied by Fluidigm (PN 100-5950 B1). Quality checks
on cDNA were done using a Bioanalyzer 2100 (Agilent Technologies). Libraries
were quantified using the LightCycler 480 (Roche), pooled and purified using 
AMPure XP beads (Beckman Coulter) with Hamilton 384 head robot (Hamilton 
Robotics). Pooled samples were sequenced on an Illumina HiSeq 2500 instrument, 
using paired-end 125-bp reads. 

Read mapping to annotated transcriptome. For bulk RNA-seq samples, adaptor 
sequences and low-quality score bases were first trimmed using Trim Galore 
(version 0.4.1) (with the parameters ‘--paired --quality 20 --length 20 -e 0.1
--adapter AGATCGGAAGAGC’). Trimmed reads were mapped and gene expression was
quantified using Salmon (version 0.6.0)*4 with the following command: ‘salmon 
quant -i [index_file_directory] -l ISR -p 8 --biasCorrect --sensitive --extraSensitive
-o [output_directory] -1 -g [ENSEMBL_transcript_to_gene_file] --useFSPD
--numBootstraps 100’. Each sample was mapped to its respective species annotated
transcriptome (downloaded from ENSEMBL, version 84: GRCh38 for human, 
MMUL_1 for macaque, GRCm38 for mouse, Rnor_6.0 for rat). We included only 
the set of coding genes (*.cdna.all.fa files). We removed annotated secondary
haplotypes of human genes by removing genes with ‘CHR_HSCHR’.

Quantifying differential gene expression in response to dsRNA. To quantify differen- 
tial gene expression between treatment and control for each species and for each 
treatment separately, we used edgeR (version 3.12.1)°° using the rounded estimated 
counts from Salmon. This was done only for genes that had a significant level of 
expression in at least one of the four species (TPM >3 in at least N - 1 libraries, 
where N is the number of different individuals we have for this species with libraries 
that passed quality control, and TPM is transcripts per million). Differential 
expression analysis was performed using the edgeR exact test, and P values were 
adjusted for multiple testing by estimating the false discovery rate (FDR). 
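
The expression filter described above can be illustrated with a short Python sketch. It is not the code used in the study: the function names and the input layout (one list of TPM values per species, one value per library) are assumptions made for this example.

```python
def passes_expression_filter(tpms, threshold=3.0):
    """True if TPM > threshold in at least N - 1 of the N libraries of one species.

    `tpms` is a hypothetical list holding one TPM value per library.
    """
    n = len(tpms)
    if n == 0:
        return False
    required = max(n - 1, 1)  # guard for the single-library case (an assumption)
    n_expressed = sum(1 for t in tpms if t > threshold)
    return n_expressed >= required


def expressed_in_any_species(tpms_by_species, threshold=3.0):
    """A gene is kept if it passes the filter in at least one species."""
    return any(
        passes_expression_filter(tpms, threshold)
        for tpms in tpms_by_species.values()
    )
```

For example, a gene with TPM values of 5, 4 and 1 in three libraries of one species passes, whereas a gene with 5, 1 and 1 does not.
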
Conservation and divergence in immune response: fold-change-based analysis. We 
compared the overall change in response to treatment (dsRNA or IFNB) between 
pairs of species, by computing the Spearman correlation of the fold-change in 
response to treatment across all one-to-one orthologues that were expressed in 
at least one species (Extended Data Fig. 1a–h). Fold-change was calculated with 
edgeR, as described above. Spearman correlations of all expressed genes appear 
in grey. Correlations of the subset of differentially expressed genes (genes with 
FDR-corrected P < 0.01 in at least one of the compared species) appear in black. 

In Extended Data Fig. 1a–c, we show comparisons in response to dsRNA. In
Extended Data Fig. 1d–f, we show comparisons in response to IFNB, which we use
here to study the similarity of the secondary immune response between species. 

We constructed a tree based on a gene’s change in expression in response to 
dsRNA and to IFNB, using expressed genes that had one-to-one orthologues across 
all four species and were expressed in at least one species in at least one condition 
(Extended Data Fig. 1i). We used hierarchical clustering, with the hclust
command from the stats R package, with the distance between samples computed as
1 − ρ, where ρ is the pairwise Spearman correlation between each pair of species
mentioned above (a greater similarity, reflected in a higher correlation, results in
a smaller distance) and ‘average’ as the clustering method.
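
The distance used for clustering can be sketched as follows. This Python example only computes the Spearman-based distance between two vectors of fold-changes; it does not reproduce the hclust step. The function names are hypothetical, and the study itself used R.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_distance(fc_a, fc_b):
    """Distance 1 - rho between the fold-changes of two species.

    fc_a and fc_b hold fold-changes for the same orthologues, in the same order.
    """
    rho = pearson(average_ranks(fc_a), average_ranks(fc_b))
    return 1.0 - rho
```

Two species whose orthologues respond in the same rank order have a distance of 0, and a fully reversed order gives the maximum distance of 2.
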

The above-mentioned analyses focus on one-to-one orthologues between the 
compared species. In Supplementary Table 6, we quantify the similarity in response 
between species (based on Spearman correlations) when adding genes with 
one-to-many orthologues. 

Quantifying gene expression divergence in response to immune challenge. To quantify 
transcriptional divergence in immune response between species, we focus on genes 
that have annotated one-to-one orthologues across the studied species (human, 
macaque, mouse and rat). 9,753 of the expressed genes have annotated one-to-one 
orthologues in all four species, out of which 955 genes are differentially expressed 
in human in response to dsRNA treatment (genes with an FDR-corrected P< 0.01). 

We define a measure of response divergence (based on a previous study**) by 
calculating the differences between the fold-change estimates across the
orthologues: response divergence = $\log\left[\frac{1}{4}\sum_{i,j}\left(\log[\mathrm{FC\ primate}_i] - \log[\mathrm{FC\ rodent}_j]\right)^2\right]$.
This measure takes into account the structure of the phylogeny, and gives a relative 
measure of divergence in response across all genes with one-to-one orthologues. 
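
As a worked illustration of this measure, the following Python sketch computes it for a single gene. It assumes that the inputs are already log fold-changes (for example, as reported by edgeR) and that the outer logarithm is the natural logarithm; the text does not specify the base, and the function name is hypothetical.

```python
import math


def response_divergence(log_fc_primates, log_fc_rodents):
    """Log of the mean squared difference between primate and rodent log fold-changes.

    With two primates and two rodents there are four (i, j) pairs, which gives
    the 1/4 factor in the formula. The natural log is an assumption.
    """
    pairs = [(p, r) for p in log_fc_primates for r in log_fc_rodents]
    mean_sq = sum((p - r) ** 2 for p, r in pairs) / len(pairs)
    # mean_sq == 0 (identical responses) would need special handling; it is
    # left to raise ValueError in this sketch.
    return math.log(mean_sq)
```

A gene with log fold-changes of 2 in both primates and 0 in both rodents has a mean squared difference of 4, so its response divergence is log(4).
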

To consider differences between species, we focus on between-clade differences 
(primates versus rodents), rather than on within-clade differences. In this way, 
we map the most significant macro-evolutionary differences along the longest 
branches of our four-species phylogeny. In addition, averaging within clades acts 
as a reduction of noise*. 

We compared this divergence measure to two other measures that use models 
(and incorporate both between- and within-clade divergence) and found a 
strong correlation between the divergence estimates across the three approaches 
(Supplementary Figs. 3, 4). 

In most of the subsequent analyses, we focus on the 955 dsRNA-responsive 
genes: genes that were differentially expressed in response to dsRNA (genes that 
have an FDR-corrected P < 0.01 in human, and have annotated one-to-one ortho- 
logues in the other three species). For some of the analyses, we split these 955 genes 
based on quartiles, into genes with high, medium and low divergence (Fig. 1c). 
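
One way to read the quartile split is sketched below in Python. It assumes that the lowest quartile forms the low-divergence group, the highest quartile the high-divergence group, and the two middle quartiles the medium group; this reading, the function name and the use of inclusive quantiles are assumptions made for the example.

```python
import statistics


def split_by_quartiles(divergence):
    """Split genes into low, medium and high divergence groups.

    `divergence` maps a gene identifier to its response divergence value.
    """
    q1, _, q3 = statistics.quantiles(divergence.values(), n=4, method="inclusive")
    groups = {"low": [], "medium": [], "high": []}
    for gene, value in divergence.items():
        if value <= q1:
            groups["low"].append(gene)
        elif value >= q3:
            groups["high"].append(gene)
        else:
            groups["medium"].append(gene)
    return groups
```
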

We also studied how imprecisions in the fold-change estimates affected the 
response divergence estimates and subsequent analyses (Supplementary Figs. 5, 6). 
Comparison of response divergence between different functional groups. To compare 
the divergence rates between sets of dsRNA-responsive genes that have different 
functions in the innate immune response, we split these 955 genes into the fol- 
lowing functional groups (all groups are mutually exclusive, and any gene that 
belongs to two groups was excluded from the latter group; human gene annota- 
tions were used). 

We first grouped genes by annotated molecular functions: viral sensors (genes 
that belong to one of the GO categories: GO:0003725 (dsRNA binding), GO:0009597 
(detection of virus), and GO:0038187 (pattern recognition receptor activity)); 
cytokines, chemokines and their receptors (GO:0005125 (cytokine activity), 
GO:0008009 (chemokine activity), GO:0004896 (cytokine receptor activity), and 
GO:0004950 (chemokine receptor activity)); transcription factors (taken from the 
Animal Transcription Factor DataBase (version 2.0)°”); chromatin modulators 
(GO:0016568 (chromatin modification), GO:0006338 (chromatin remodelling), 
GO:0003682 (chromatin binding), and GO:0042393 (histone binding)); kinases 
and phosphatases (GO:0004672 (protein kinase activity) and GO:0004721 
(phosphoprotein phosphatase activity)); ligases and deubiquitinases (GO:0016579 
(protein deubiquitination), GO:0004842 (ubiquitin-protein transferase activity) and 
GO:0016874 (ligase activity)); and other enzymes (mostly involved in metabolism
rather than regulation: GO:0003824 (catalytic activity)). The response divergence
values of these functional subsets were compared to the entire group of 955
dsRNA-responsive genes (Fig. 2d, e). 


Next, we grouped genes by biological processes that are known to be important 
in the innate immune response: antiviral defence (GO:0051607 (defence response 
to virus)); inflammation (GO:0006954 (inflammatory response)); apoptosis 
(GO:0006915 (apoptotic process)); and regulation. Because GO annotations related
to regulation of innate immune response pathways include only a few genes, we
used as the regulation group the merged set of genes annotated as transcription
factors, chromatin modulators, kinases and phosphatases, or ligases and
deubiquitinases, since all of these groups include many genes that are known to
regulate the innate immune response.

Gene lists belonging to the mentioned GO annotations were downloaded using 
QuickGo*™. The distribution of response divergence values for each of the func- 
tional groups was compared with the distribution of response divergence of the 
entire set of dsRNA-responsive genes. Cytokines, chemokines and their receptors
are merged in Figs. 2d, e and 3c. Analogous comparisons of functional groups in IFNB
response (with 841 IFNB-responsive genes) are shown in Supplementary Fig. 1. 
See additional analyses in Supplementary Information. 
Alignment and peak calling of ChIP-seq reads. ChIP-seq reads were trimmed using 
trim_galore (version 0.4.1) with ‘--paired --trim1 --nextera’ flags. The trimmed reads
were aligned to the corresponding reference genome (hg38 for human, rheMac2
for macaque, mm10 for mouse, rn6 for rat; all these genomes correspond to the
transcriptomes used for RNA-seq mapping) from the UCSC Genome Browser”? 
using bowtie2 (version 2.2.3) with default settings™. In all four species, we removed 
the Y chromosome. In the case of human, we also removed all alternative haplotype 
chromosomes. Following alignment, low-confidence alignments and improperly paired 
reads were removed by samtools® with ‘-q 30 -f 2’ flags. 

Enriched regions (peaks) were called using MACS2 (v.2.1.1) with a corrected
P value cutoff of 0.01 with ‘-f BAMPE -q 0.01 -B --SPMR’ flags, using input DNA
as control. The genome sizes (the argument for the ‘-g’ flag) used were ‘hs’ for
human, ‘mm’ for mouse, 3.0 × 10⁹ for macaque and 2.5 × 10⁹ for rat. Peaks were
considered reproducible when they were identified in at least two of the three biological
replicates and overlapped by at least 50% of their length (non-reproducible peaks 
were excluded from subsequent analyses). Reproducible peaks were then merged to 
create consensus peaks from overlapping regions of peaks from the three replicates 
by using mergeBed from the bedtools suite™. 
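
The reproducibility rule and the merging step can be sketched in Python for peaks on a single chromosome. This is an illustration, not the pipeline used in the study: peaks are represented as half-open (start, end) tuples, the 50% overlap is measured relative to the length of the peak being tested (the text does not say which peak the fraction refers to), and `merge_intervals` only imitates the default behaviour of mergeBed.

```python
def overlap_fraction(peak, other):
    """Fraction of `peak` (start, end) that is covered by `other`."""
    start, end = peak
    o_start, o_end = other
    overlap = max(0, min(end, o_end) - max(start, o_start))
    return overlap / (end - start)


def reproducible_peaks(replicates, min_fraction=0.5):
    """Keep peaks supported by a peak from at least one other replicate."""
    kept = []
    for i, peaks in enumerate(replicates):
        others = [p for j, rep in enumerate(replicates) if j != i for p in rep]
        for peak in peaks:
            if any(overlap_fraction(peak, o) >= min_fraction for o in others):
                kept.append(peak)
    return sorted(kept)


def merge_intervals(peaks):
    """Merge overlapping or touching intervals into consensus regions."""
    merged = []
    for start, end in sorted(peaks):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

With three replicates containing the peaks (100, 200) and (500, 600), (120, 220), and (1000, 1100), only the first and third peaks listed are reproducible, and they merge into the consensus peak (100, 220).
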
Gene assignment and conservation of active promoters and enhancers. We
subsequently linked human peaks with the genes they might be regulating as follows:
an H3K4me3 consensus peak was considered the promoter region of a given gene if
its centre was between 2 kb upstream and 500 bp downstream of the annotated
TSS of the most abundantly expressed transcript of that gene.

Similarly, an H3K27ac consensus peak was considered the enhancer region of a
given gene if its centre was more than 1 kb and less than 1 Mb from the TSS, and
there was no overlap (of 1 bp or more) with any H3K4me3 peak.

In each case where, based on the distance criteria, more than a single peak was
linked to a gene (or more than a single gene was linked to a peak), we took only
the closest peak–gene pair (ensuring that each peak is linked to at most one gene
and vice versa).
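
The distance criteria above can be sketched in Python as follows. The helper names are hypothetical, coordinates are assumed to lie on the same chromosome, and the greedy matching in `closest_pairs` is only one way to read the closest-pair rule. The check that enhancer peaks do not overlap H3K4me3 peaks is omitted.

```python
def signed_offset(peak_centre, tss, strand):
    """Offset of the peak centre from the TSS; negative values are upstream."""
    offset = peak_centre - tss
    return offset if strand == "+" else -offset


def is_promoter_peak(peak_centre, tss, strand, upstream=2000, downstream=500):
    """Centre within 2 kb upstream and 500 bp downstream of the TSS."""
    return -upstream <= signed_offset(peak_centre, tss, strand) <= downstream


def is_enhancer_peak(peak_centre, tss, min_dist=1000, max_dist=1_000_000):
    """Centre more than 1 kb and less than 1 Mb from the TSS."""
    return min_dist < abs(peak_centre - tss) < max_dist


def closest_pairs(candidates):
    """Greedily keep the closest pairs so each peak and gene is used once.

    `candidates` is a list of (distance, peak_id, gene_id) tuples.
    """
    used_peaks, used_genes, pairs = set(), set(), []
    for _, peak, gene in sorted(candidates):
        if peak not in used_peaks and gene not in used_genes:
            pairs.append((peak, gene))
            used_peaks.add(peak)
            used_genes.add(gene)
    return pairs
```
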

To compare active promoters and enhancers between species, we excluded any 
human peak that could not be uniquely mapped to the respective region in the 
other species. This was done by looking for syntenic regions of human peaks in 
the other three species by using liftOver™, and removing peaks that had either 
unmapped regions or more than one mapped region in the compared species. 
We considered syntenic regions with at least 70% sequence similarity between the 
species (minMatch = 0.7, and 0.8 in the case of human-macaque comparison), 
with a minimal length (minSizeQ and minSizeT) corresponding to the length of 
the shortest peak (128 bp for H3K4me3 and 142 bp for H3K27ac). 

We defined an active human promoter or enhancer as conserved if a peak
was identified in the corresponding region of the other species (we repeated this
analysis by comparing human with each of the other three species separately). We
compared the occurrence of conserved promoters and enhancers in genes that 
are highly divergent in response to dsRNA with low-divergence genes, and used 
Fisher's exact test to determine the statistical significance of the observed differ- 
ences between high- and low-divergence genes (Extended Data Fig. 2). 
Promoter sequence analysis. To calculate the total number of transcription factor 
binding motifs in a gene’s active promoter region, we downloaded the non- 
redundant JASPAR core motif matrix (pfm_vertebrates.txt) from the JASPAR 2016 
server® and searched for significant matches for these motifs using FIMO® in 
human H3K4me3 peaks. The TFBM density of peaks was calculated by dividing 
the total number of motif matches in a peak by the peak’s length. TFBM density 
values in human H3K4me3 peaks linked with high- and low-divergence genes 
were compared (Fig. 2a). 

PhyloP7 values were used to assess promoter sequence conservation®”. Sequence 
conservation quantification was performed by taking the estimated nucleotide sub- 
stitution rate for each nucleotide along the promoter sequence (500 bp upstream of 
the TSS of the relevant human gene). When several annotated transcripts existed, 
the TSS of the most abundantly expressed transcript was used (based on bulk RNA 
data). The substitution rate values from all genes were aligned, based on their TSS 
position, and a mean for each of the 500 positions was calculated separately for the 
group of genes with high, medium and low response divergence. The two-sample 
Kolmogorov-Smirnov test was used to compare the paired distribution of rates 
between the means of the high-divergence and low-divergence sets of genes. To plot 
the mean values of the three sets of divergent genes, the geom_smooth function 
from the ggplot2 R package was used with default parameters (with loess as the 
smoothing method) (Fig. 2b). 

Human CGI annotations were downloaded from the UCSC genome table 
browser (hg38), and CGI genes were defined as those with a CGI overlapping 
their core promoter (300 bp upstream of the TSS reference position, and 100 bp 
downstream of it, as suggested previously'’). Genes were defined as having a TATA 
box if they had a significant match to the Jaspar TATA box matrix (MA0108.1) in 
the 100 bp upstream of their TSS by FIMO with default settings (we used a 100 bp 
window owing to possible inaccuracies in TSS annotations). We note that only 
28 out of 955 dsRNA-responsive genes had a matching TATA-box motif in this 
region. For both TATA and CGI analyses, the promoter sequences of the human 
orthologues were used. 

Read mapping and quality control of scRNA-seq (full-length RNA). Gene expression 
was quantified in a manner similar to the quantification for bulk transcriptomics
libraries described above. Low-quality cells were filtered out using quality control
criteria: cells were retained if they had at least 100,000 mapped reads, at least
2,000 expressed genes with TPM > 3, ERCC < 10% and MT < 40%, where ERCC and MT
refer to reads mapped to synthetic RNA Spike-In genes and mitochondrial genes.
This quality control filtering resulted in 240 cells from a first biological replicate, 
including two technical replicates (with a time course of 0, 4, 8 h). In a second larger 
biological replicate (with a dsRNA stimulation time course of 0, 2, 4, 8 h), 728 cells 
passed quality control. Results throughout the manuscript relate to the second 
cross-species biological replicate in which a higher proportion of cells passed QC, 
and the lower-quality first replicate data were not considered further. 
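
The cell-level quality control criteria can be written as a single predicate. The Python sketch below is an illustration with hypothetical argument names; the ERCC and MT values are assumed to be fractions of mapped reads.

```python
def passes_cell_qc(mapped_reads, n_genes_tpm_gt3, ercc_fraction, mt_fraction):
    """Apply the four Smart-seq2 quality control thresholds to one cell."""
    return (
        mapped_reads >= 100_000        # at least 100,000 mapped reads
        and n_genes_tpm_gt3 >= 2_000   # at least 2,000 genes with TPM > 3
        and ercc_fraction < 0.10       # spike-in reads below 10%
        and mt_fraction < 0.40         # mitochondrial reads below 40%
    )
```
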

Cell-to-cell variability analysis. To quantify the biological cell-to-cell variability 
of genes, we applied the DM (Distance to Median) approach—an established 
method, which calculates the cell-to-cell variability in gene expression while 
accounting for confounding factors such as gene expression level*”. This is done 
by first filtering out genes that are expressed at low levels: for Smart-seq2 data 
we included only genes that had an average expression of at least 10 size-factor 
normalized reads (except for Extended Data Fig. 9a, in which we reduced the 
threshold to 5, to allow a larger number of genes to be included in the comparisons). 
This procedure was done to filter genes that displayed higher levels of technical 
variability between samples owing to low expression. Second, to account for gene 
expression level, the observed cell-to-cell variability of each gene was compared 
with its expected variability, based on its mean expression across all samples and 
in comparison with a group of genes with similar levels of mean expression. This 
DM value is also corrected by gene length (in the case of Smart-seq2 data), yield- 
ing a value of variability that can be compared across genes regardless of their 
length and mean expression values®*. As a second approach, we used BASICS.” 
(see Supplementary Information). 
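
The core idea of the DM calculation, comparing each gene's variability with a running median over genes of similar mean expression, can be sketched as follows. This Python example is a simplified illustration and not the published implementation: it assumes variability is summarized as the squared coefficient of variation, uses a small fixed window, and omits the filtering of lowly expressed genes and the gene-length correction.

```python
import math
import statistics


def distance_to_median(mean_expr, cv2, window=3):
    """Distance of each gene's log10 CV^2 from a running median.

    `mean_expr` and `cv2` map gene identifiers to mean expression and squared
    coefficient of variation. Genes are ordered by mean expression, and the
    running median is taken over `window` neighbouring genes.
    """
    genes = sorted(mean_expr, key=lambda g: mean_expr[g])
    log_cv2 = [math.log10(cv2[g]) for g in genes]
    half = window // 2
    dm = {}
    for i, gene in enumerate(genes):
        lo, hi = max(0, i - half), min(len(genes), i + half + 1)
        dm[gene] = log_cv2[i] - statistics.median(log_cv2[lo:hi])
    return dm
```

A positive value indicates that a gene is more variable than genes with similar mean expression, and a negative value that it is less variable.
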

We note that the relationship observed in Fig. 3a between response divergence
and cell-to-cell variability is not an artefact stemming from differences in
expression levels: (A) With respect to cell-to-cell variability, a gene’s expression
level is controlled for by DM calculations, where expression level is regressed by 
using a running median (Supplementary Fig. 14). (B) Similarly, we can regress 
the expression level measured in bulk RNA-seq from the quantified response 
divergence by subtracting the running median of expression from the divergence 
estimates. When repeating the analysis comparing cell-to-cell variability versus 
regressed response divergence, the relationship between the two is maintained 
(Supplementary Fig. 15). 

Cytokine co-expression analysis. For the chemokine gene CXCL10, we built a 
network (using CytoScape’') of genes that correlate with CXCL10 in dsRNA- 
stimulated human fibroblasts and in at least one more species, using genes 
with a Spearman correlation value above 0.3 (see Fig. 3d and Supplementary 
Information). 

Coding sequence evolution analysis. The ratio dN/dS (non-synonymous to syn- 
onymous codon substitutions) of human genes across the mammalian clade was 
obtained from a previous study that used orthologous genes from 29 mammals”. 
Distributions of dN/dS values were computed for each of the three groups of genes 
with low, medium and high divergence in response to dsRNA, and are plotted in 
Fig. 4a. 

Rate of gene gain and loss analysis. The significance at which a gene’s family has 
experienced a higher rate of gene gain and loss in the course of vertebrate evolu- 
tion, in comparison with other gene families, was retrieved from ENSEMBL”?. 
The statistics provided by ENSEMBL are calculated using the CAFE method”, 
which estimates the global birth and death rate of gene families and identifies gene 
families that have accelerated rates of gain and loss. Distributions of the P values 
from this statistic were computed for each of the three groups of genes with low, 
medium and high divergence in response to dsRNA and are plotted as the negative 
logarithm values in Fig. 4b. 

Gene age analysis. Gene age estimations were obtained from ProteinHistorian”®. 
To ensure that the results were not biased by a particular method of ancestral pro- 
tein family reconstruction or by specific gene family assignments, we used eleven 
different estimates for mammalian genes (combining five different databases of 
protein families with two different reconstruction algorithms for age estimation, 
as well as an estimate from the phylostratigraphic approach). For each gene, age 
was defined with respect to the species tree, where a gene’s age corresponds to the 
branch in which its family is estimated to have appeared (thus, larger numbers 
indicate evolutionarily older genes). 

Data for gene age in comparison with divergence in response to dsRNA are 
shown in Fig. 4c (using Panther7 phylogeny and Wagner reconstruction algorithm) 
and in Supplementary Fig. 17a (for all 11 combinations of gene family assignments 
and ancestral family reconstructions). See additional analyses in Supplementary 
Information. 

Cellular protein-protein interaction analysis. Data on the number of experimen- 
tally validated PPIs for human genes were obtained from STRING (version 10)”®. 
Distributions of PPIs for genes with low, medium and high divergence in response 
to dsRNA are plotted in Fig. 4d. 

Host-virus interaction analysis. Data on host-virus protein-protein interactions 
were downloaded from the VirusMentha database*?, and combined with two addi- 
tional studies that have annotated host-virus protein-protein interactions™. We 
split the 955 dsRNA-responsive genes into genes with known viral interactions 
(genes whose protein products were reported to interact with at least one viral 
protein), and genes with no known viral interactions: ‘viral interactors’ and ‘no
viral interactions’, respectively, in Fig. 4e. In addition, we define a subset of genes
within the viral interactors set: those known to interact with viral proteins that 
are immunomodulators (proteins known to target the host immune system and 
modulate its response“). 

We note that the results presented in Fig. 4e are in agreement with previous 
analyses that are based on all human genes and on coding sequence evolution**. 
However, the overlap in the sets of genes between the previous analyses and the 
one presented here is small (for example, in one published study“® there were 535 
human genes with known interactions with pathogens, 57 of which overlap with 
the 955 genes that are the basis of the current analysis). 

Additional experiments with human fibroblasts and human skin tissue. 
Additional experiments were performed with human dermal fibroblasts and 
with cells extracted from human skin tissues to study in greater detail the 
relationship between response divergence across species and cell-to-cell variability. 
See Supplementary Methods and Supplementary Discussion for details. 
Cross-species bone marrow-derived phagocyte stimulation with LPS and 
dsRNA. Tissue culture. Primary bone marrow-derived mononuclear phagocytes 
originating from females of four different species (black 6 mouse, brown Norway 
rat, rabbit and pig) and cultured with GM-CSF, were obtained from PeloBiotech. 
Twenty-four hours before the start of the stimulation time course, cells were thawed 
and split into 12-well plates (500,000 cells per well). Cells were stimulated with: 
(1) 100 ng/ml LPS (Invivogen, tlrl-smlps), or with (2) 1 µg/ml high-molecular
mass poly(I:C) (Invivogen, tlrl-pic) transfected with 2 µl/ml Lipofectamine 2000
(ThermoFisher, 11668027). LPS stimulation time courses of 0, 2, 4, 6 h were 
performed for all species. Poly(I:C) stimulations were performed for rodents for 
0, 2, 4, 6 h. We also processed cells for bulk RNA-seq for 0 and 4 h stimulation 
time points. Details on the individuals used in each technique are listed in 
Supplementary Table 2. 

Library preparation for single cells using microfluidic droplet cell capture. Following 
stimulation, cells were collected using Cell Dissociation Solution Non-enzymatic 
(Sigma-Aldrich, C5914), washed and resuspended in 1 x PBS with 0.5% (w/v) 
BSA. Cells were then counted and loaded on the 10x Chromium machine aiming 
for a targeted cell recovery of 5,000 cells according to the manual. Libraries were 
prepared following the Chromium Single Cell 3’ v2 Reagent Kit Manual”. Libraries 
were sequenced on an Illumina HiSeq 4000 instrument with 26 bp for read 1 and 
98 bp for read 2. 

Library preparation and sequencing for bulk RNA-seq. Total RNA was extracted and 
libraries were prepared as described in the fibroblasts section. Pooled samples were 
sequenced on an Illumina HiSeq 4000 instrument, using paired-end 75-bp reads. 
Quantifying gene expression in bulk RNA-seq data. Adaptor sequences and 
low-quality score bases were trimmed using Trim Galore (version 0.4.1). Trimmed 
reads were mapped and gene expression was quantified using Salmon (version 
0.9.1)*4 with the following command: ‘salmon quant -i [index_file_directory] / 
-l ISR -p 8 --seqBias --gcBias --posBias -q -o [output_directory] -1 -g
[ENSEMBL_transcript_to_gene_file] --useVBOpt --numBootstraps 100’. Mouse samples were
mapped to mouse transcriptome (ENSEMBL, version 84). We note that we used 
the bulk data only for TSS analysis. For differential expression analysis, we used 
an in silico bulk from the single-cell data (see below). 

Quantifying gene expression in microfluidic droplet cell capture data. Microfluidic 
droplet cell capture data was first quantified using 10x Genomics’ Cell Ranger 
Single-Cell Software Suite (version 2.0, 10x Genomics Inc.)’” against the 
relevant genome (ENSEMBL, version 84). We removed cells with fewer than 
200 genes or more than 10% mitochondrial reads. To remove potential doublets, 
we excluded the top 10% of cells expressing the highest numbers of genes. Genes 
expressed in less than 0.5% of the cells were excluded from the calculations. 
We then filtered cells that expressed fewer than 10% of the total number of 
filtered genes. 
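
The order of the filtering steps can be sketched in Python. This is an illustration under stated assumptions, not the code used in the study: cells are represented by the set of genes they express, `mt_fraction` is a hypothetical per-cell fraction of mitochondrial reads, and boundary cases (for example, exactly 10% mitochondrial reads) are resolved arbitrarily.

```python
def filter_droplet_cells(genes_by_cell, mt_fraction, min_genes=200, max_mt=0.10,
                         doublet_top=0.10, min_gene_cell_frac=0.005,
                         min_cell_gene_frac=0.10):
    """Return (kept_cells, kept_genes) after the four filtering steps."""
    # 1. Remove cells with too few genes or too many mitochondrial reads.
    cells = [c for c in genes_by_cell
             if len(genes_by_cell[c]) >= min_genes and mt_fraction[c] <= max_mt]
    # 2. Remove potential doublets: the top 10% of cells by number of genes.
    cells.sort(key=lambda c: len(genes_by_cell[c]))
    n_drop = int(len(cells) * doublet_top)
    cells = cells[:len(cells) - n_drop]
    # 3. Remove genes expressed in less than 0.5% of the remaining cells.
    counts = {}
    for c in cells:
        for g in genes_by_cell[c]:
            counts[g] = counts.get(g, 0) + 1
    kept_genes = {g for g, k in counts.items()
                  if k >= min_gene_cell_frac * len(cells)}
    # 4. Remove cells expressing fewer than 10% of the filtered genes.
    kept_cells = [c for c in cells
                  if len(genes_by_cell[c] & kept_genes)
                  >= min_cell_gene_frac * len(kept_genes)]
    return kept_cells, kept_genes
```
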

Since bone marrow-derived phagocytes may include secondary cell
populations, we focused our analysis on the major cell population. We identified
clusters within each data set, using the Seurat”® functions RunPCA, followed 
by FindClusters (using 20 dimensions from the PCA, default perplexity and a 
resolution of 0.1), and took the cells belonging to the largest cluster for 
further analysis, resulting in a less heterogeneous population of cells. A lower 
resolution of 0.03 was used for rabbit-LPS4, rabbit-LPS2, mouse-PIC2, mouse-PIC4; 
and 0.01 for rabbit-LPS6. 
Quantifying gene expression divergence in response to immune challenge. We cre- 
ated an in silico bulk table by summing up the UMIs of the post-QC single cells 
belonging to the largest cluster of cells, in each of the samples. We then used the 
three replicates in unstimulated conditions and in 4 h LPS stimulation to per- 
form a differential expression analysis using DESeq2”? Wald test, and P values 
were adjusted for multiple testing by estimating the FDR. A similar procedure was 
performed with mouse and rat dsRNA stimulation (with 4 h dsRNA stimulation 
versus unstimulated conditions). 
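
The in silico bulk construction amounts to summing UMI counts per gene over the selected cells of a sample. A minimal Python sketch, with a hypothetical function name and input layout, is shown below; the differential expression step with DESeq2 is not reproduced.

```python
from collections import Counter


def in_silico_bulk(umi_counts_by_cell, cells_in_cluster):
    """Sum per-gene UMI counts over the cells of the largest cluster.

    `umi_counts_by_cell` maps a cell barcode to a {gene: UMI count} dict.
    """
    bulk = Counter()
    for cell in cells_in_cluster:
        bulk.update(umi_counts_by_cell[cell])  # adds counts gene by gene
    return dict(bulk)
```
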

To quantify transcriptional divergence in immune response between species, we 
focused on genes that have annotated one-to-one orthologues across the studied 
species. 

We define a measure of response divergence by calculating the differences 
between the fold-change estimates across the orthologues: response
divergence = $\log\left[\frac{1}{3}\sum_{j}\left(\log[\mathrm{FC\ pig}] - \log[\mathrm{FC\ glire}_j]\right)^2\right]$. For each gene, the fold-change
in the outer group (pig), is subtracted from the fold-change in the orthologues of 
the three glires (mouse, rat and rabbit), and the average of the square values of these 
subtractions is taken as the response divergence measure. In most of the analyses, 
we focus on the 2,336 LPS-responsive genes—genes that are differentially expressed 
in response to LPS (genes that have an FDR-corrected P< 0.01 in mouse, and have 
annotated one-to-one orthologues in the other three species). 

Promoter elements, gene function and cell-to-cell variability analyses. Promoter ele- 
ments (TATA and CGIs), gene function and cell-to-cell variability analyses were 
performed as described in the fibroblasts section. Mouse genes were used as the 
reference for gene function and TSS annotations. For variability analysis, we used 
one representative replicate out of three. 

Statistical analysis and reproducibility. Statistical analyses were done with 
R version 3.3.2 for Fisher’s exact test, two-sample Kolmogorov-Smirnov test 
and Mann-Whitney test. Data in boxplots represent the median, first quartile 
and third quartile with lines extending to the furthest value within 1.5 times the 
interquartile range (as implemented by the R function geom_boxplot). Violin plots 
show the kernel probability density of the data (as implemented by the R function 
geom_violin). 

All cross-species bulk RNA-seq replicates were successful, except for one 
macaque individual in which the treated sample had a low RNA quality and was 
removed from the analysis (along with the matching control). All cross-species 
ChIP-seq replicates were successful. Cross-species scRNA-seq of fibroblasts was 
performed in two biological replicates. Results throughout the manuscript relate 
to the second cross-species biological replicate, for which a higher proportion of 
cells passed technical quality control. Three out of three replicates for each species 
and condition were successful when preparing single-cell libraries for mononu- 
clear phagocytes, except for two libraries that failed at the emulsion preparation 
stage. Two out of two replicates of single-cell in situ RNA hybridization assay were 
performed and both are shown. 

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 


Code availability. Scripts for major analyses are available at https://github.com/ 
Teichlab/innate_evo. 


Data availability 
Extended Data Fig. 1 | Fibroblast response to dsRNA and IFNB across 
species. To study the similarity in response to treatment across species, 
we plotted the fold-change values of all expressed genes (with one-to-one 
orthologues) between pairs of species (human-macaque, mouse-rat and 
human-mouse) in response to dsRNA (poly(I:C)) (a-c). As a control, 

we performed the same procedure with IFNB stimulations (d-f). Fold- 
changes were inferred from differential expression analyses, determined 
by the exact test in the edgeR package® and based on n=6, 5, 3 and 3 
individuals from human, macaque, rat and mouse, respectively. Spearman 
correlations between all expressed one-to-one orthologues are shown in 
grey, Spearman correlations between the subset of differentially expressed 


I'd fold change 


dsRNA IFNB 


genes (FDR-corrected P< 0.01 in at least one species) appear in black. 
Number of genes shown is m= 11,035, 11,005, 11,137, 10,851, 10,826 

and 10,957 in a-f, respectively. Genes are coloured blue if they were 
differentially expressed (FDR-corrected P < 0.01) in both species, purple 
if they were differentially expressed in only one species, or red if they were 
not differentially expressed. g, h, Density plots of ratio of fold-change 

in response to dsRNA or to IFNB. g, Comparison between human and 
macaque orthologues in dsRNA response. h, Comparison between human 
and mouse orthologues in IFNB response. i, Dendrogram based on the 
fold-change in response to dsRNA or to IFNB across 9,835 one-to-one 
orthologues in human, macaque, rat and mouse. 
Extended Data Fig. 2 | Correspondence of transcriptional divergence
and divergence of active promoters and enhancers. Comparison of
divergence in transcriptional response to dsRNA with divergence of active
chromatin marks in active promoters (a, profiled using H3K4me3 in
proximity to gene’s TSS) and enhancers (b, H3K27ac without overlapping
H3K4me3). Chromatin marks were linked to genes on the basis of their
proximity to the gene’s TSS. Chromatin marks were obtained from n = 3
individuals in each of the four species, from fibroblasts stimulated with
dsRNA or left untreated. The statistics are based on n = 855, 818 and 813
human genes that have a linked H3K4me3 mark with a syntenic region
in macaque, rat and mouse, respectively (a); and on n = 326, 241 and 242
human genes that have a linked H3K27ac mark with a syntenic region in
macaque, rat and mouse, respectively (b). Each panel shows the fraction of
conserved marks between human and macaque, rat or mouse, in genes that
have high, medium and low divergence in their transcriptional response.
In each column, the histone mark’s signal was compared between human
and the syntenic region in one of the three other species. If an active mark
was found in the corresponding syntenic region, the linked gene was
considered to have a conserved active mark (promoter or enhancer). The
fractions of genes with conserved promoters (or enhancers) in each pair
of species were compared between high- and low-divergence genes using
a one-sided Fisher’s exact test. When comparing active promoter regions
of high- versus low-divergence genes, we observe that low-divergence
genes have a significantly higher fraction of conserved marks in rodents.
This suggests an agreement between divergence at the transcriptional and
chromatin levels in active promoter regions. In active enhancer regions, we
do not observe these patterns, suggesting that the major contribution to
divergence comes from promoters.
Extended Data Fig. 3 | Comparison of response divergence of genes
containing various promoter elements. Comparison of response 
divergence between genes with and without a TATA-box and a CGI. Left, 
fibroblasts (n = 14, 14, 633 and 294 differentially expressed genes with 
only TATA-box element, with both CGI and TATA-box elements, with 
only CGI, and with neither element in their promoters, respectively); right, 
phagocytes (n = 13, 29, 1,718 and 576 differentially expressed genes with 
only a TATA-box element, with both CGI and TATA-box elements, with 
only a CGI, and with neither element in their promoters, respectively).
Genes with a TATA-box without a CGI have higher response divergence 
than genes with both elements. Genes with a CGI but without a TATA- 
box diverge more slowly than genes with both elements. Genes with both 
elements do not differ significantly in their divergence from genes lacking 
both elements (one-sided Mann-Whitney test). Data in boxplots represent 
the median, first quartile and third quartile with lines extending to the 
furthest value within 1.5 of the IQR. 
Extended Data Fig. 4 | Response divergence of molecular processes
upregulated in immune response. Left, distributions of divergence values
of n = 955 dsRNA-responsive genes in fibroblasts and subsets of this group
belonging to different biological processes. For each functional subset, the
distribution of divergence values is compared with the set of 955
dsRNA-responsive genes using a one-sided Mann-Whitney test. FDR-corrected
P values are shown above each group and group size is shown inside each
box. Right, distributions of divergence values of n = 2,336 LPS-responsive
genes in mononuclear phagocytes and subsets of this group belonging to
different biological processes. For each functional subset, the distribution
of divergence values is compared with the set of 2,336 LPS-responsive
genes. FDR-corrected P values (one-sided Mann-Whitney test) are
shown above each group and group size is shown inside each box. Data in
boxplots represent the median, first quartile and third quartile with lines
extending to the furthest value within 1.5 of the IQR.


© 2018 Springer Nature Limited. All rights reserved. 


Extended Data Fig. 5 | Cell-to-cell variability versus response
divergence across species and conditions in fibroblasts after dsRNA
stimulation. Cell-to-cell variability values, as measured with DM across
individual cells, compared with response divergence between species
(grouped into low, medium and high divergence). Variability values are
based on n = 29, 56, 55, 35 human cells, n = 20, 32, 29, 13 rhesus cells,
n = 33, 70, 65, 40 rat cells, and n = 53, 81, 59, 30 mouse cells, stimulated
with dsRNA for 0, 2, 4 and 8 h, respectively. Rows represent different
dsRNA stimulation time points (0, 2, 4 and 8 h), and columns represent
different species as shown. High-divergence genes were compared with
low-divergence genes using a one-sided Mann-Whitney test. Data in
boxplots represent the median, first quartile and third quartile with lines
extending to the furthest value within 1.5 of the IQR.




[Figure panels. Columns: Mouse, Rat, Rabbit, Pig. x axis: Response divergence (Low, Medium, High).]
after LPS stimulation. Cell-to-cell variability values, as measured with
DM across cells, compared with response divergence between species
(grouped into low, medium and high divergence). Variability values are

(0, 2, 4 and 6 h), and columns represent different species as shown.
High-divergence genes were compared with low-divergence genes using
a one-sided Mann-Whitney test. Data in boxplots represent the median,
Extended Data Fig. 7 | Cell-to-cell variability of cytokine expression
in single cell in situ RNA hybridization assay combined with flow
cytometry (PrimeFlow). PrimeFlow measurement of two cytokine genes
(IFNB and CXCL10) that show high cell-to-cell variability in scRNA-seq.
As controls, two genes matched on expression levels (ATXN2L
and ADAM32) but that show low cell-to-cell variability in scRNA-seq
data are shown. As the expression of cytokines is at the low end of the
distribution, we also chose two genes with middle-range expression values
(ADAMTSL3 and BRD2) as additional controls. The experiment was
performed in n = 2 independent replicates, originating from the same
individual. Both replicates are shown. a, Pseudocolour contour plot for
RNA target expression in dsRNA-stimulated human fibroblasts. The x-axis
shows area of side scatter (SSC-A) and the y-axis shows fluorescent signal
for target RNA probes. RNA targets detected by the same fluorescent
channel are displayed together. Top, IFNB and control genes BRD2 and
ATXN2L, type 1 probe, Alexa Fluor 647. Bottom, CXCL10 and control
genes ADAMTSL3 and ADAM32, type 10 probe, Alexa Fluor 568. The
cytokine genes display a broader range of fluorescence signal than the
controls. b, Histograms comparing fluorescence of cytokine and control
pairs (IFNB-BRD2 for type 1 probe and CXCL10-ADAM32 for type 10
probe). The histograms show a bimodal distribution of expression signal
for the two cytokine genes (IFNB and CXCL10, red), but not for controls
(blue). This agrees with scRNA-seq data in which CXCL10 and IFNB
display high levels of cell-to-cell variability.
Extended Data Fig. 8 | Cell-to-cell variability levels and response
divergence of cytokines, transcription factors and kinases in response
to LPS stimulation of phagocytes. A scatter plot showing divergence in
response to LPS across species and transcriptional cell-to-cell variability
in mouse mononuclear phagocytes following 4 h of LPS treatment, in
n = 2,262 LPS-responsive genes. Purple, cytokines; green, transcription
factors; beige, kinases. The distributions of divergence values and
cell-to-cell variability values of each of the three functional groups are shown
above and to the right of the scatter plot, respectively.
Extended Data Fig. 9 | Cell-to-cell variability levels in cytokines,
transcription factors and kinases across species and stimulation time
points. Violin plots showing the distribution of cell-to-cell variability
values (DM) of cytokines, transcription factors and kinases during
immune stimulation. Left, fibroblast dsRNA stimulation time course.
Number of cells used in each species (at 2, 4, 8 h dsRNA, respectively):
human, 56, 55, 35; macaque, 32, 29, 13; rat, 70, 65, 40; mouse, 81, 59, 30.
Right, phagocyte LPS stimulation time course. Number of cells used in
each species (at 2, 4, 6 h LPS, respectively): mouse, 4,321, 3,293, 2,126; rat,
2,839, 1,963, 1,607; rabbit, 1,820, 1,522, 1,660; pig, 1,614, 1,899, 1,381. For
both panels, colours as in Fig. 3c. Comparisons between groups of genes
were performed using one-sided Mann-Whitney tests. Violin plots show
the kernel probability density of the data.
Active superelasticity in three-dimensional
epithelia of controlled shape


Ernest Latorre, Sohan Kale, Laura Casares, Manuel Gómez-González, Marina Uroz, Léo Valon, Roshna V. Nair,
Elena Garreta, Nuria Montserrat, Aranzazu del Campo, Benoit Ladoux, Marino Arroyo & Xavier Trepat


Fundamental biological processes are carried out by curved epithelial sheets that enclose a pressurized lumen. How these 
sheets develop and withstand three-dimensional deformations has remained unclear. Here we combine measurements 
of epithelial tension and shape with theoretical modelling to show that epithelial sheets are active superelastic materials. 
We produce arrays of epithelial domes with controlled geometry. Quantification of luminal pressure and epithelial tension 
reveals a tensional plateau over several-fold areal strains. These extreme strains in the tissue are accommodated by highly 
heterogeneous strains at a cellular level, in seeming contradiction to the measured tensional uniformity. This phenomenon 
is reminiscent of superelasticity, a behaviour that is generally attributed to microscopic material instabilities in metal 
alloys. We show that in epithelial cells this instability is triggered by a stretch-induced dilution of the actin cortex, and is 
rescued by the intermediate filament network. Our study reveals a type of mechanical behaviour—which we term active 
superelasticity—that enables epithelial sheets to sustain extreme stretching under constant tension. 


Epithelial tissues enable key physiological functions, including
morphogenesis, transport, secretion and absorption. To perform these
functions, epithelia often adopt a three-dimensional (3D) architecture
that consists of a curved cellular sheet enclosing a pressurized
fluid-filled lumen. The loss of this 3D architecture is associated with
developmental defects, inflammatory conditions and cancer.

The acquisition of a 3D shape by epithelial sheets requires a tight 
control of cellular deformation, mechanical stress and luminal pres- 
sure. How these mechanical variables are tuned together to sculpt 3D 
epithelia is unknown, because current techniques to map epithelial 
mechanics are largely restricted to two-dimensional (2D) layers seeded 
on a flat substrate or freely standing between cantilevers. Here we
report direct measurements of traction, tension, pressure and defor- 
mation in 3D epithelial monolayers of controlled size and shape. These 
measurements establish that epithelial monolayers exhibit active super- 
elasticity, an unanticipated mechanical behaviour that enables extreme 
deformations at nearly constant tension. 


Micropatterned epithelial domes 

We used transmural pressure as the morphogenetic driving force to
shape epithelial monolayers in 3D. We seeded Madin-Darby canine
kidney (MDCK) cells on a soft polydimethylsiloxane (PDMS) substrate
that was homogeneously coated with fibronectin except for
micropatterned, non-adhesive areas of precise geometry (Fig. 1a). A few hours
after seeding, cells covered the adherent regions of the gel, and with
time they invaded the non-adherent areas. Because MDCK cells are
known to actively pump osmolytes in the apico-basal direction, we
reasoned that fluid pressure should build up in the interstitial space
between cells and the impermeable substrate, which would lead to
tissue delamination from the substrate in the non-adherent regions.
Consistent with this rationale, we observed the spontaneous formation
of multicellular epithelial domes that closely followed micropatterned
shapes, such as circles, rectangles and stars (Fig. 1b-e, Extended Data
Fig. 1a-d). In contrast to spontaneous doming by delamination,
our control of the dome footprint gave us access to large variations in
the dome aspect ratio (Fig. 1c-e).


Measurement of dome mechanics 

To measure dome mechanics, we focused on circular patterns and 
implemented 3D traction microscopy to determine the three compo- 
nents of tractions at the surface of the PDMS substrate (Fig. 2a, b). 
Tractions in adherent regions showed large fluctuations without a clear 
spatial pattern (Fig. 2b). By contrast, non-adherent areas exhibited sys- 
tematic normal and nearly uniform negative tractions that indented 
the substrate. In a narrow annular region at the margin of the dome 
footprint, the traction vector consistently exhibited a positive normal 
component that pulled the substrate upward. These observations— 
along with the morphology of the domes—established that the lumen 
was in a state of hydrostatic pressure, and that the free-standing part 
of the monolayer sustained tension to balance this pressure (Fig. 2a). 

We then wondered whether we could map the tensional state of the
dome, even though constituent cells did not directly generate tractions
on the substrate. Epithelial domes followed a spherical cap geometry
very closely (Fig. 2b), which implies that their surface tension ($\sigma$) was
isotropic, uniform and obeyed Laplace's law ($2\sigma = R\,\Delta P$, where
$\Delta P$ is the transmural pressure and $R$ the radius of curvature of the
dome; see Supplementary Note 1). This equation enabled us to measure
the epithelial tension of the domes, as the normal traction in the
non-adherent regions provides a direct readout of $\Delta P$ and $R$ could
be measured from confocal stacks. We found tissue tensions in the
millinewton per metre range, similar to previous measurements in
2D monolayers.
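
As a worked illustration of this measurement principle, the following minimal Python sketch evaluates Laplace's law, 2σ = R ΔP, for a spherical cap. It relies on the standard geometric relation R = (h² + a²)/(2h) between the radius of curvature, the cap height h and the base radius a, which is textbook geometry rather than a formula quoted from the paper. The numerical inputs are assumed, illustrative values (not measurements from this study), and the function names are hypothetical.

```python
def cap_radius_of_curvature(h, a):
    """Radius of curvature of a spherical cap of height h and base radius a."""
    return (h**2 + a**2) / (2.0 * h)


def laplace_tension(delta_p, radius):
    """Surface tension of a spherical cap from Laplace's law, 2*sigma = R*dP."""
    return 0.5 * radius * delta_p


# Illustrative inputs only (assumed, not taken from the paper).
h = 50e-6        # dome height in metres
a = 50e-6        # radius of the non-adhesive footprint in metres
delta_p = 80.0   # transmural pressure in pascals

R = cap_radius_of_curvature(h, a)
sigma = laplace_tension(delta_p, R)
print(f"R = {R * 1e6:.1f} um, sigma = {sigma * 1e3:.2f} mN/m")
```

With these assumed inputs, a hemispherical dome 50 μm high under a pressure of a few tens of pascals gives a tension of about 2 mN per metre, which is the order of magnitude reported in the text.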

To test the principle behind our tension measurement, we perturbed
the system with the Rho kinase inhibitor Y-27632, which is known to
reduce tissue tension. Because the epithelial barrier has finite
permeability to water, the enclosed volume (and hence $R$) cannot change
instantaneously upon this perturbation. Consequently, Laplace’s law
requires that tension relaxation be paralleled by a pressure drop. This
prediction was confirmed by our measurements (Fig. 2c-g, Extended
Data Fig. 2a-c, Supplementary Video 1). We also examined water
transport by subjecting domes to hyper-osmotic shocks (Supplementary
Note 2). Volume dynamics under osmotic perturbations were
consistent with a simple physical picture in which the epithelium behaves in
a manner similar to a semi-permeable membrane actively pumping
osmolytes at nearly constant rate.

Fig. 1 | Generation of epithelial domes of controlled size and shape.
a, Scheme of the process of dome formation. b, Top view of an array of
15 × 15 epithelial domes (n = 10). Scale bar, 1 mm. c-e, Confocal x-y, y-z


Constitutive relation between dome tension and strain 
In the absence of pharmacological or osmotic perturbations and over
timescales of hours, epithelial domes exhibited large volume
fluctuations (Fig. 3a, Supplementary Video 2). These fluctuations involved
periods of slow swelling and de-swelling combined with sudden
volume drops, often up to total dome collapse and subsequent rebirth.
The magnitude of collapse events, presumably caused by localized
disruptions of epithelial integrity, and the duration of swelling phases
exhibited high variability (Fig. 3a, b, Extended Data Fig. 3). During
these spontaneous fluctuations, we tracked luminal pressure and
dome geometry, which provided a measurement of epithelial tension
at different degrees of swelling (Fig. 3c-e, Supplementary Video 2).
To examine these data, we represented tension in the free-standing
tissue as a function of nominal areal strain of the dome $\varepsilon_d = (h/a)^2$,
which is defined as the difference between the actual area of the dome
$\pi(h^2 + a^2)$ and the area of the non-adhesive region $\pi a^2$, normalized
by the latter (see Fig. 2b for a definition of $h$ and $a$). All domes
exhibited tensions of about 1 mN m$^{-1}$ at small strains. At moderate strains
(below 100%), tension progressively increased according to a highly
reproducible law. Beyond this point, tension exhibited larger scatter
but reached a plateau at about 2 mN m$^{-1}$ for areal strains up to 300%
(Fig. 3e, Extended Data Fig. 4a). The existence of this plateau is
notable, as it reveals that epithelial domes maintain tensional
homeostasis while undergoing deformations that change their area by up to
fourfold. Human epithelial colorectal adenocarcinoma (Caco-2) cells
showed a plateau at similar tension but lower strain (Extended Data
Fig. 4b, c; see Supplementary Table 1 for a list of cell lines known to
form domes).
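
The definition of nominal areal strain can be checked numerically. The short Python sketch below computes the dome area π(h² + a²), subtracts the footprint area πa² and normalizes by the footprint, which reduces to (h/a)². The function names are hypothetical and the inputs are arbitrary illustrative values.

```python
import math


def dome_area(h, a):
    """Surface area of a spherical cap of height h and base radius a."""
    return math.pi * (h**2 + a**2)


def nominal_areal_strain(h, a):
    """(dome area - footprint area) / footprint area, which equals (h/a)**2."""
    footprint = math.pi * a**2
    return (dome_area(h, a) - footprint) / footprint


# A dome as high as its footprint radius has 100% areal strain; a strain of
# 300% corresponds to h = sqrt(3) * a, that is, a fourfold increase in area.
for h_over_a in (0.5, 1.0, math.sqrt(3.0)):
    strain = nominal_areal_strain(h_over_a, 1.0)
    print(f"h/a = {h_over_a:.3f}  ->  areal strain = {strain * 100:.0f}%")
```

This makes explicit why a strain of 300% and a fourfold change in area are the same statement: the dome area is the footprint area multiplied by one plus the strain.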

A number of mechanisms could contribute to such tensional
homeostasis, including directed or accelerated cell division, junctional
network rearrangements, and cell exchange between domes and
the adjacent adhered tissue. Visual examination of the domes showed
that cell division and extrusion were rare (Supplementary Videos 3,
4). Moreover, the number of cells in the dome remained constant
during the several-fold increases in dome area (Extended Data Fig. 1f).
and x-z sections of MDCK-LifeAct epithelial domes (see ‘Cell culture’ in
Methods) with a circular basal shape and varying spacing (n = 10). Scale
bar, 100 μm.
Fig. 2 | Measurement of luminal pressure and dome tension. a, Scheme
of dome mechanics. The lumen is under uniform pressure $\Delta P$ (black
arrows) and the free-standing monolayer is under surface tension $\sigma$
(yellow arrows). b, Traction vectors of a dome of MDCK-LifeAct cells.
Top, lateral view. Bottom, 3D traction maps overlaid on a top view of the
dome. Yellow arrows represent in-plane components and the colour map
represents the vertical component. Scale bar, 50 μm. Scale arrows, 150 Pa
(representative of n = 13 domes). c, d, Tractions exerted by
MDCK-LifeAct cells before (control) and after a 5-min incubation with 30 μM
of Y-27632. Scale bar, 50 μm. Scale arrows, 75 Pa. e-g, Time evolution
of dome volume and curvature (e), pressure (f) and tension (g) before
(control) and after adding Y-27632. The time points corresponding to c, d
are labelled in e-g (representative of n = 3 domes).
Fig. 3 | Constitutive relation between dome tension and strain.
a, Spontaneous time evolution of tractions in a MDCK-LifeAct dome
(y-z section). Scale bar, 50 μm. Scale arrows, 150 Pa. Regions in the dome
monolayer that lack fluorescence signal correspond to unlabelled cells,
not to gaps. b-d, Time evolution of spontaneous fluctuations in dome
volume (b), pressure $\Delta P$ (c) and surface tension $\sigma$ (d) (representative of
n = 9 domes). e, Surface tension in the free-standing sheet as a function
of nominal areal strain of the dome $\varepsilon_d$ (n = 9 domes, each sampled over
various time points). The solid line and shaded area indicate mean ± s.d.
obtained by binning the data (n = 14 points per bin). f, Normalized dome

We thus concluded that the tension-strain response of the tissue had 
to depend on the mechanics of cell stretching. 

To understand the tension-strain relation of the dome monolayer,
we developed a theoretical vertex model in 3D. The model is
based on the well-established observation that the major
determinant of epithelial-cell mechanics is the actin cortex. In the timescales
of our experiments, this thin cytoskeletal network behaves in a manner
similar to a fluid gel, and is capable of developing contractile
tension owing to myosin motors. In 3D vertex models, these
active tensions act along lateral ($\gamma_l$) and apico-basal ($\gamma_{ab}$) faces of
polyhedral cells (Fig. 3g, Supplementary Note 3). Assuming constant cell
volume and idealizing cells as regular hexagonal prisms of uniform
thickness under uniform equibiaxial strain, this model predicts that
the effective surface tension of the tissue depends on cellular areal
strain $\varepsilon_c$ as

$$\sigma = \gamma_{ab} - \frac{k\,\gamma_l}{(1+\varepsilon_c)^{3/2}} \qquad (1)$$

where k is a non-dimensional constant. This active constitutive relation 
recapitulates the initial increase in tension and the subsequent plateau 
at larger areal strain that are observed experimentally (Fig. 3e, f). The 
tendency of tension to plateau at large strains emerges naturally from 
the fact that the area of lateral faces decreases with cell stretching and, 
hence, tissue tension converges to apico-basal tension. To theoretically 
examine tissue stretching by dome swelling, we developed a compu- 
tational version of the vertex model shown in Fig. 3g (Supplementary 
Note 3). The tension-strain law evaluated using this computational 
approach closely matched the analytical constitutive relation in equa- 
tion (1) (Fig. 3f). 
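
To make the shape of this constitutive relation concrete, the Python sketch below evaluates equation (1), read here as σ = γ_ab − k γ_l / (1 + ε_c)^(3/2). The parameter values are assumptions chosen only so that the curve starts near 1 mN per metre and approaches 2 mN per metre, echoing the magnitudes reported above; they are not fitted values from the paper, and the function name is hypothetical.

```python
import numpy as np


def tissue_tension(strain, gamma_ab, gamma_l, k):
    """Effective tissue tension for hexagonal prismatic cells of constant
    volume: sigma = gamma_ab - k * gamma_l / (1 + strain)**1.5."""
    strain = np.asarray(strain, dtype=float)
    return gamma_ab - k * gamma_l / (1.0 + strain) ** 1.5


# Assumed illustrative parameters, in mN/m for the tensions (k is unitless).
gamma_ab, gamma_l, k = 2.0, 1.0, 1.0

strains = np.array([0.0, 0.5, 1.0, 2.0, 3.0])  # cellular areal strain
sigma = tissue_tension(strains, gamma_ab, gamma_l, k)

for e, s in zip(strains, sigma):
    print(f"areal strain {e * 100:4.0f}%  ->  tension {s:.3f} mN/m")
```

The lateral term decays as (1 + ε_c)^(-3/2), so tension rises quickly at small strain and then flattens towards the apico-basal tension, which is the plateau behaviour described in the text.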


surface tension as a function of areal strain calculated with the vertex
model. The dashed blue line represents the cellular constitutive relation
in equation (1), based on a sheet of identical hexagonal cells under
uniform strain ($\varepsilon_d = \varepsilon_c$). The solid red line is the result of a multicellular
computational vertex model for a dome with an initial geometry that was
obtained experimentally. Insets show the computed dome shape at 50%
(left) and 300% (right) nominal areal strain. g, Scheme of an idealized
monolayer undergoing uniform equibiaxial stretching, representing model
assumptions leading to equation (1).


Although this simple theoretical framework captured the tension- 
strain relationship, it missed a notable experimental feature: during 
swelling and de-swelling, we systematically observed cells that barely 
changed area coexisting with cells that reached cellular areal strains of 
up to 1,000%, which is five times greater than the average dome strain 
(Fig. 4a, b, Extended Data Fig. 5a-e, Supplementary Videos 5-7). This
extreme heterogeneity in strain is reminiscent of that observed in highly
stretched epithelia in vivo, such as the trophoblast in human and mouse
blastocysts (Extended Data Fig. 5f, g). In both epithelial domes and
blastocysts, strain heterogeneity would seem to be in contradiction 
with their spherical shape, which implies tensional uniformity. The 
heterogeneity of cellular strain increased sharply beyond areal strains 
of approximately 100% (Fig. 4a, Extended Data Fig. 5). This strain 
threshold coincides with the onset of the tensional plateau and with 
the increase in the scatter of tissue tension (Fig. 3e). 


Epithelial domes exhibit superelastic behaviour 

Taken together, our experiments show that epithelial domes exhibit 
large reversible deformations and a tensional plateau during which 
superstretched constitutive elements coexist with barely stretched 
ones. These uncommon material features are defining hallmarks of 
superelasticity, a behaviour that is observed in some inert materials 
such as nickel-titanium alloys. These materials are able to undergo
large and reversible deformations at constant stress by heterogeneously
switching between low- and high-strain phases. The microscopic
trigger of superelasticity is a mechanical instability that results from a 
decreasing branch in the stress-strain relation of the material (strain 
softening). We reasoned that, by analogy with this behaviour, cell 
monolayers might behave as superelastic materials by switching from 
barely stretched to superstretched cellular states at constant tension. 
Fig. 4 | Epithelial domes exhibit superelasticity. a, Cell strain $\varepsilon_c$ versus
dome strain $\varepsilon_d$ during a deflation event for a subset of cells. Coloured
curves correspond to cells labelled in b. Dashed line, $\varepsilon_c = \varepsilon_d$. Inset shows
variance (Var) of $\varepsilon_c$ versus $\varepsilon_d$. b, Deflating dome of MDCK-CAAX cells
(see ‘Cell culture’ in Methods). Scale bar, 50 μm. c, Model prediction
of stretch-induced cortical dilution. d, Sum of intensity projection and
confocal section of a dome stained with phalloidin for F-actin. Scale bar,
50 μm. e, Zoom of representative cells. Scale bar, 10 μm. f, F-actin intensity
along the bands marked in e. AU, arbitrary units. g, Normalized density
of cortical F-actin (stained with phalloidin) versus cellular strain (n = 68
cells from 5 domes). h, Normalized density of cortical F-actin (SiR-actin)
versus cellular strain during swelling (upward triangles) and de-swelling
(downward triangles). n = 26 cells from 7 domes. Solid line and shaded
area in g, h indicate mean ± s.d. i, Live imaging of the cortex (SiR-actin) at
two instants during swelling. j, Intensity profiles along bands shown in i.
k, Non-monotonic cellular constitutive relation predicted by the vertex
model, accounting for softening by cortical depletion and re-stiffening
at extreme cellular strains (blue line). Dome tension-strain relationship
for the multicellular computational version of the same model (red line).
Labels R1 to R4 correspond to panels shown in q and r. l-n, Dome of
MDCK cells expressing keratin-18-GFP (green) stained for F-actin
(phalloidin, red), and nuclei (Hoechst, blue) (n = 3). Scale bar, 10 μm
(l, n), 50 μm (m). o, p, Changes in cell area after laser cuts of keratin
bundles for weakly stretched (blue, n = 8) and superstretched cells
(red, n = 7), represented as cell area before and after cuts (o) and as
normalized cell-area increment (p). **P = 0.0023, ***P < 0.0003, NS, not
significant (o, P = 0.3282). Two-tailed Mann-Whitney tests. Mean ± s.d.
q, $\varepsilon_c$ versus $\varepsilon_d$ from the vertex model. Inset, variance of $\varepsilon_c$ versus $\varepsilon_d$.
r, Bottom, computed geometries during deflation. Top, effective potential
energy landscape of active origin. Tilted by tissue tension, this landscape
exhibits two wells at sufficiently high tension, corresponding to barely
stretched and superstretched cellular states.
To explore this possibility further, we sought a strain-softening
mechanism that would explain the mechanical instability that underlies the
transition between low- and high-strain phases.

Because cellular deformations increased the surface area of the actin
cortex by over threefold, we hypothesized that strain softening arose
from the limited availability of cytoskeletal components. Scarcity of
cytoskeletal components could lead to stretch-induced cortical dilution,
which could impair the ability of the cortex to generate active tension
(Supplementary Note 3). To test this hypothesis, we incorporated
cortical dynamics in the 3D vertex model. We focused on actin as the
main cortical component, although cortical depletion could also affect
actin cross-linkers, polymerization agents and molecular motors. In
our model, cortical thickness (or, equivalently, cortical surface density
ρ) is determined by a balance between polymerization at the plasma
membrane and depolymerization in the bulk of the actin gel. If the
availability of cytoskeleton components ready for polymerization is
infinite, this model predicts that cortical density ρ, and hence
cortical tension γ, are constant and independent of strain, which leads to
equation (1). However, if free cytoskeleton components are limited,
the model predicts a progressive depletion of cortical density with
cellular areal strain, and hence strain softening when the cortex
becomes sufficiently thin (Fig. 4c). To test this physical mechanism,
we measured cortical surface density ρ in cells located at the apex of
fixed domes and represented this surface density as a function of cell
strain εc. These experiments showed that superstretched cells
systematically exhibited less-dense cortices (Fig. 4d-g, Extended Data Fig. 6).
Moreover, live imaging of cells labelled with SiR-actin showed that
the actin cortex became progressively and reversibly diluted with cell
stretching (Fig. 4h-j, Supplementary Video 8).
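
The dilution argument can be illustrated with a deliberately simplified mass-balance sketch. This is not the cortical model of Supplementary Note 3. It assumes a single cell with a fixed total amount of actin, a steady state in which polymerization (proportional to the free concentration) balances depolymerization (proportional to the cortical density), and an areal strain defined as the relative increase in cortical area. Under these assumptions the cortical density relative to its unstretched value is (1 + φ) / (1 + φ(1 + ε)), where φ is the hypothetical ratio of cortex-bound to free actin at zero strain. The function name and the parameter φ are illustrative and do not come from the paper.

```python
def relative_cortical_density(areal_strain: float, bound_to_free_ratio: float) -> float:
    """Steady-state cortical density divided by its value at zero strain.

    Toy model (an assumption, not the authors' model): total actin is
    conserved, and polymerization from the free pool balances
    depolymerization in the cortex.

    areal_strain        -- relative increase in cortical area, (A - A0) / A0
    bound_to_free_ratio -- cortex-bound / free actin at zero strain (phi);
                           phi = 0 corresponds to an unlimited free pool
    """
    if areal_strain <= -1.0:
        raise ValueError("areal strain must be greater than -1")
    if bound_to_free_ratio < 0.0:
        raise ValueError("bound_to_free_ratio must be non-negative")
    phi = bound_to_free_ratio
    return (1.0 + phi) / (1.0 + phi * (1.0 + areal_strain))


if __name__ == "__main__":
    # Compare an unlimited pool (phi = 0) with a limited pool (phi = 1).
    for strain in (0.0, 1.0, 2.0, 3.0):
        unlimited = relative_cortical_density(strain, 0.0)
        limited = relative_cortical_density(strain, 1.0)
        print(f"strain={strain:.1f}  unlimited: {unlimited:.2f}  limited: {limited:.2f}")
```

With an unlimited pool the density stays constant at every strain, whereas with a limited pool it falls monotonically as the cell stretches, which is the qualitative behaviour described above.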

We exogenously interfered with cell-cell junctions and the actin 
cytoskeleton (Extended Data Figs. 2, 7). Notably, we locally triggered 
actin depolymerization using a photoactivatable derivative of cytoch- 
alasin D. Upon activation, targeted cells increased their area without 
noticeable changes in the overall shape of the dome (Extended Data 
Fig. 8, Supplementary Video 9), which indicates that cortical dilution 
is sufficient to cause large increases in cell area. Taken together, these 
results are consistent with our hypothesis that cortical dilution under- 
lies cellular superstretching. 

Besides strain softening, superelasticity also requires re-stiffening 
at large strains to confine the high-strain phase. Without such a mech- 
anism, the first cell to reach the softening regime would easily deform 
further, relaxing neighbouring cells and eventually localizing
deformation in an unbounded fashion (Supplementary Note 3, Supplementary
Video 10). Multiple mechanisms could stiffen cells that are subjected
to extreme stretching, including exhaustion of the plasma membrane
reservoir, crowding of adhesion molecules in shrinking cell-cell
adhesions, confinement of the nucleus between tensed cortices or
load transfer to the otherwise-relaxed intermediate filament
cytoskeleton. Our experiments do not rule out any of these possibilities but do
provide support for the last. Indeed, intermediate filaments in
superstretched cells appeared unusually straight, which suggests that these
filaments are load-bearing (Fig. 4l-n, Extended Data Fig. 9). To fur- 
ther test this mechanism, we laser-ablated keratin-18 filaments. In 
weakly stretched cells, laser ablation did not induce changes in cell 
area. By contrast, laser ablation in superstretched cells resulted in a 
rapid increase in cell area, indicating that intermediate filaments in 
superstretched cells, but not in relaxed cells, bear tension (Fig. 4o, p,
Extended Data Fig. 10). By introducing re-stiffening at large strains 
into our computational vertex model, we were able to recapitulate our 
most-salient experimental observations (Supplementary Videos 11, 12). 
At low levels of dome stretching, tissue tension increased with strain, 
and heterogeneity in cellular strain was low (Fig. 4k, q, r). By contrast, 
at high levels of stretching, the domes reached a tensional plateau and 
heterogeneity in cellular strain rose sharply. Thus, strain softening 
by stretch-induced depletion of cortical components followed by re- 
stiffening at extreme stretches configures an effective bistable energy 
landscape of active origin that explains the emergence of a stable
high-strain phase of superstretched cells under sufficiently large
tension (Fig. 4r, Supplementary Note 3).

Active superelasticity provides a mechanism for epithelial tissues to 
undergo extreme and reversible deformations at nearly constant tension 
by progressive switching of individual cells to a superstretched state. 
Our study suggests that, because the underlying subcellular mecha- 
nisms are generic, superelasticity may have a broad applicability in vivo. 
For example, epithelial superelasticity may mediate the spreading of
superstretched extra-embryonic tissues and their subsequent rapid
compaction. Active superelasticity may also enable extreme cellular
strains in the trophectoderm during the swelling and hatching of
mammalian blastocysts. Besides providing a framework to understand
epithelial mechanics and morphogenesis in vivo, the material
laws established here set the stage for a rational manipulation of cell
monolayers in organoids and organ-on-a-chip technologies.

METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and investigators were not blinded to allocation during 
experiments and outcome assessment. 

Fabrication of soft silicone gels. Soft elastomeric silicone gels were prepared using 
a protocol based on previous publications. In brief, a silicone elastomer was
synthesized by mixing a 1:1 weight ratio of CY52-276A and CY52-276B poly- 
dimethylsiloxane (Dow Corning Toray). After degassing for 5 min, the gel was 
spin-coated on glass-bottom dishes (35-mm, no. 0 coverslip thickness, Mattek) 
for 90 s at 400 r.p.m. The samples were then cured at 80°C for 1 h. The substrates 
were kept in a clean, dust-free and dry environment and they were always used 
within 4 weeks of fabrication. 

Coating the soft PDMS substrate with fluorescent beads. After curing, the soft 
PDMS was treated with (3-aminopropyl)triethoxysilane (APTES, Sigma-Aldrich, 
cat. no. A3648) diluted at 5% in absolute ethanol for 3 min, rinsed 3 times with 
96% ethanol, and dried in the oven for 30 min at 60°C. Samples were incubated
for 5 min with a filtered (220 nm) and sonicated solution of 200-nm-diameter 
red fluorescent carboxylate-modified beads (FluoSpheres, Invitrogen) in sodium 
tetraborate (3.8 mg/ml, Sigma-Aldrich), boric acid (5 mg/ml, Sigma-Aldrich) 
and 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC, 0.1 mg/ml,
Sigma-Aldrich), as previously described. Next, gels were rinsed 3 times with type-1 water
and dried in the oven for 15 min at 60°C. Beads were passivated by incubating 
the samples with tris-buffered saline (TBS, Sigma-Aldrich) solution for 20 min at 
room temperature. Finally, substrates were rinsed again 3 times with type-1 water 
and dried in the oven for 15 min at 60°C. 

Soft PDMS stiffness measurements. Gel stiffness was measured by indenting the
gel with a large metal sphere (diameter, 1,000 µm) of known mass. The indentation
caused by the weight of the sphere was determined using confocal microscopy.
From the measured indentation and sphere mass, we obtained Young’s modulus
by applying Hertz theory, corrected for the finite thickness of the gel. We found
a Young’s modulus of 2.9 ± 0.5 kPa (mean ± s.d., n = 6), in good agreement with
published data.
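
As a rough illustration of the Hertz step, the sketch below inverts the classical Hertz relation for a rigid sphere on an elastic half-space, F = (4/3) E/(1 − ν²) √R δ^(3/2), to obtain Young's modulus from the sphere weight and the measured indentation. It omits the finite-thickness correction used here, neglects buoyancy, and assumes an incompressible gel (ν = 0.5). These simplifications, the function names and the numerical inputs in the usage example are assumptions for illustration only.

```python
import math

GRAVITY = 9.81  # m/s^2, standard gravitational acceleration


def sphere_weight(mass_kg: float) -> float:
    """Weight of the indenting sphere in newtons (buoyancy neglected)."""
    return mass_kg * GRAVITY


def hertz_youngs_modulus(force: float, sphere_radius: float,
                         indentation: float, poisson_ratio: float = 0.5) -> float:
    """Young's modulus (Pa) from uncorrected Hertz theory.

    Inverts F = (4/3) * E / (1 - nu^2) * sqrt(R) * delta^(3/2) for a rigid
    sphere on an elastic half-space. No finite-thickness correction is applied.
    All inputs are in SI units (N, m, m).
    """
    if force <= 0 or sphere_radius <= 0 or indentation <= 0:
        raise ValueError("force, radius and indentation must be positive")
    reduced_modulus = 3.0 * force / (4.0 * math.sqrt(sphere_radius) * indentation ** 1.5)
    return reduced_modulus * (1.0 - poisson_ratio ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: a 1,000-µm-diameter sphere of assumed mass 4 mg
    # and an assumed indentation of 50 µm.
    force = sphere_weight(4.0e-6)
    modulus = hertz_youngs_modulus(force, sphere_radius=500e-6, indentation=50e-6)
    print(f"Young's modulus (uncorrected Hertz): {modulus:.0f} Pa")
```

Because a thin gel on glass appears stiffer than a half-space of the same material, the uncorrected value from this sketch would overestimate the modulus, which is why a finite-thickness correction is applied in the measurement described above.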

Cell patterning on soft PDMS. PDMS patterning stamps were incubated with a 
fibronectin solution at 40 µg/ml (fibronectin from human plasma, Sigma-Aldrich)
for 1 h. Next, the protein was transferred to polyvinyl alcohol (PVA,
Sigma-Aldrich) membranes, which were then placed in contact with the gel surface for 1 h.
Membranes were dissolved and the surface was passivated at the same time using 
Pluronic F127 (Sigma-Aldrich) 0.2% w/v overnight at 4°C. Afterwards, the soft 
silicone gels were washed with phosphate-buffered saline (PBS, Sigma-Aldrich) 
and incubated with cell culture medium for 30 min. For cell seeding, the culture 
medium was removed and a 70-µl drop containing ~150,000 cells was placed on
the soft PDMS. Fifty minutes after seeding, the unattached cells were washed away 
using PBS and more medium was added. Cells were seeded at least 48 h before 
experiments. 

PDMS patterning stamps. PDMS (Sylgard, Dow Corning) stamps for micropat- 
terning were fabricated. In brief, SU8-50 masters containing cylinders that were 
80 µm or 100 µm in diameter were raised using conventional photolithography.
Uncured PDMS was poured on the masters and cured for 2 h at 65°C. PDMS 
was then peeled off from the master and kept at room temperature in a clean and 
dust-free environment until use. 

Three-dimensional traction microscopy. Three-dimensional traction forces were
computed using traction microscopy with finite gel thickness. To account for
both geometrical and material nonlinearities, a finite element method (FEM)
solution was implemented. Confocal stacks of the fluorescent beads covering the gel
surface were taken with a z-step of 0.3 µm and a total depth of 15 µm. A 3D
displacement field of the top layer of the gel between any experimental time point and its
relative reference image (obtained after cell trypsinization) was computed using
home-made particle imaging velocimetry software based on an iterative algorithm
with a dynamic interrogation window size and implementing convergence criteria
based on image intensity, as described in previous publications. Results for the
normal traction inside the dome were compared to analytical solutions for a liquid
droplet over an elastic substrate with finite thickness.

Cell culture. MDCK strain II and Caco-2 cells were used. To visualize specific cell 
structures, the following stable fluorescent cell lines were used: MDCK expressing 
LifeAct-GFP (MDCK-LifeAct) to visualize the actin cytoskeleton, MDCK express- 
ing CIBN-GFP-CAAX to visualize the plasma membrane (MDCK-CAAX), 
MDCK expressing keratin-18-GFP (MDCK-K18) to visualize intermediate fila- 
ments. All MDCK lines were cultured in minimum essential medium with Earle’s 
Salts and L-glutamine (Gibco) supplemented with 10% v/v fetal bovine serum 
(FBS; Gibco), 100 µg/ml penicillin and 100 µg/ml streptomycin. Selection antibiotic
geneticin (Thermo Fisher Scientific) was added at 0.5 mg/ml to LifeAct stable cell
lines. Cells were maintained at 37°C in a humidified atmosphere with 5% CO2.
Live imaging of F-actin was performed by incubating cells (12 h, 100 nM) with a
live-cell fluorogenic F-actin labelling probe (SiR-actin, Spirochrome). Caco-2 cells
were imaged using BODIPY FL C16 dye (1 µM, 1 h incubation, Thermo Fisher
Scientific). MDCK-LifeAct cells were obtained from the laboratory of B. Ladoux. 
MDCK keratin-18-GFP cells were obtained from the laboratory of G. Charras. 
MDCK-CAAX cells were obtained by viral infection of CIBN-GFP-CAAX. 
Caco-2 cells were bought from Sigma Aldrich (86010202). Cell lines tested nega- 
tive for mycoplasma contamination. All MDCK cell lines were authenticated by the 
laboratories that provided them. Caco-2 cells were authenticated by the provider 
(Sigma Aldrich, from the ECACC). 

Pharmacological interventions and osmotic shocks. To perturb actomyosin 
contractility, cells were treated with Rho kinase inhibitor Y-27632 (InSolution 
Calbiochem, Merck-Millipore, 30 µM, 5-min incubation). To inhibit the ARP2/3
complex, cells were treated with CK666 (Sigma Aldrich, 100 µM, 1-h
incubation). To perturb the osmolarity, D-mannitol (Sigma-Aldrich, final concentration
100 mM) was added to the medium. To weaken cell-cell junctions, EGTA 
(Sigma-Aldrich, final concentration 2 mM, 30-min incubation) was added to 
the medium. 

Cell immunofluorescence. MDCK cells were fixed with 4% paraformaldehyde in 
PBS for 10 min at room temperature and permeabilized using 0.1% Triton X100 
(Sigma-Aldrich) in PBS for 10 min at room temperature. Cells were blocked in 1% 
bovine serum albumin (BSA, Sigma-Aldrich) in PBS for 1 h (at room temperature). 
Phalloidin (Alexa Fluor 555 phalloidin, Thermo Fisher Scientific) was then added 
at 1:1,000 dilution in PBS and incubated for 30 min at room temperature. To iden- 
tify nuclei, cells were then incubated for 10 min in a Hoechst solution (Hoechst 
33342, Thermo Fisher Scientific) at 1:2,500 dilution in PBS. Images were acquired 
with a spinning disk confocal microscope using a Nikon 60× oil 1.4 numerical
aperture (NA) lens. 

Time-lapse microscopy. Multidimensional acquisition for traction force
measurements was performed using an inverted Nikon microscope with a spinning
disk confocal unit (CSU-W1, Yokogawa), a Zyla sCMOS camera (Andor, image size
2,048 × 2,048 pixels) and a Nikon 40× 0.75 NA air lens. The microscope was
equipped with temperature control and CO2 control, using Andor iQ3 or
Micro-Manager software.

Laser ablation. The set-up used has previously been described. In brief, MDCK
keratin-18-GFP cells were cultured on thin PDMS micropatterned substrates and 
allowed to form domes. We then used a sub-nanosecond ultraviolet pulsed laser 
to ablate individual filament bundles in weakly stretched and superstretched cells. 
Immediately after ablation we monitored the time evolution of keratin filaments, 
and we obtained bright-field images of the domes to measure cell area. Experiments 
were performed at 37 °C and 5% CO2.

Photoactivatable cytochalasin D. We used a phototriggerable derivative of cyto- 
chalasin D that includes a nitroveratryloxycarbonyl photoremovable group located 
at the hydroxyl group at C7 of cytochalasin D. Attachment of the chromophore 
renders cytochalasin D temporarily inactive. Upon light exposure, cytochalasin D 
becomes active and causes local depolymerization of the actin cytoskeleton. For 
experiments, MDCK-CAAX domes were incubated with SiR-actin to visualize 
the cortical cytoskeleton. Individual cells were illuminated with a 405-nm laser to 
activate cytochalasin D. After the pulse, the cell area and actin cytoskeleton were 
visualized using time-lapse microscopy (63× oil, Zeiss LSM 880).

Image analysis. Fiji software was used to perform the image analysis. The
pairwise stitching plugin was used to create 3D montages; the maximum intensity
z-projection and the sum-slices z-projection were used where appropriate. Actual
cell areas were computed from z-projections using the methodology described in 
Supplementary Note 4. 

Animals. Animal care and experiments were carried out according to protocols 
approved by the Ethics Committee on Animal Research of the Science Park of 
Barcelona (PCB), Spain (Protocol number 7436). Outbred B6CBAF1/JRj mice 
(males and females of 5-6 weeks of age) were obtained from Janvier Labs. Mice
were kept in a 12 h light:dark cycle (lights on 07:00-19:00) with ad libitum access 
to food and water. 

Embryo collection and in vitro culture. For embryo collection, superovulation 
was induced in B6CBAF1/JRj female mice by intraperitoneal injection of 7.5 I.U.
of pregnant mare serum gonadotropin (PMSG), followed after 48 h by
7.5 I.U. of human chorionic gonadotropin (hCG). Superovulated females were then
paired with male mice, and subsequently euthanized by cervical dislocation 20 h 
after hCG injection. Then, one-cell stage embryos (zygotes) were collected from 
the excised oviducts into medium containing 0.1% (w/v) hyaluronidase (Sigma) 
to remove cumulus cells under a dissection microscope. Recovered zygotes were 
cultured in micro-droplets of culture medium covered with mineral oil at 37°C and 
5% CO2 until the blastocyst stage. Neither randomization nor blinding was performed
as experiments did not involve comparisons between groups. Experiments were 
reproduced four times. 

Blastocyst immunofluorescence. Blastocysts at different degrees of development 
were fixed with 4% paraformaldehyde (Aname) for 20 min at room temperature. 
Then, fixed blastocysts were washed three times with PBS containing 1% bovine
serum albumin (Sigma), 2% goat serum (Sigma) and 0.01% Triton X-100 (Sigma),
referred to as blocking buffer. Next, blastocysts were permeabilized with 2.5% 
Triton X-100 (Sigma) in PBS for 30 min at room temperature and subsequently 
washed three times with blocking buffer. Blastocysts were incubated overnight at 
4°C in anti-E-cadherin primary antibody (610181, BD Biosciences), diluted 1:50 in 
blocking buffer. The following day, blastocysts were washed three times with block- 
ing buffer and incubated for 90 min at 37°C in Alexa Fluor (A) 488-conjugated 
secondary antibody (A21202, Thermo Fisher), diluted 1:200 in blocking buffer. 
Nuclei were counterstained with DAPI (D1306, Life Technologies) for 30 min. 
Image acquisition was performed in a SP5 Leica microscope or a Zeiss LSM780 
confocal microscope using a plan-apochromat 40x oil DIC M27 objective. 

Code availability. MATLAB analysis procedures are available from the
corresponding authors on reasonable request.

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 


Data availability 
The data that support the findings of this study are available from the correspond- 
ing authors on reasonable request. 

Extended Data Fig. 2 | Dome response to inhibition of tension and
weakening of cell-cell adhesion. a, Time evolution of surface tension and
volume of a representative dome in response to Y27632 (30 µM, added
at t = 0 min). b, Cellular areal strain εc as a function of dome nominal
areal strain εd during dome swelling. Only a subset of cells is represented
and most cells with εc < εd have been omitted for clarity. Coloured lines
represent the cells labelled in c. Dashed line represents the relation εc = εd.
The inset represents the variance of εc within the dome as a function of εd.
c, Maximum intensity projection and x-z and y-z confocal sections of an
epithelial dome of MDCK-CAAX cells before (−1 min) and after (12 min
and 26 min) addition of Y27632 (30 µM, added at t = 0 min). The time
evolution of coloured cells is depicted in b using the same colour code.
Scale bars, 50 µm. Data are representative of n = 3 experiments.
d, Maximum intensity projection and corresponding x-z and y-z profiles,
showing the collapse of a dome of MDCK-CAAX cells after treatment
with 2 mM EGTA (30 min and 35 min after the addition of EGTA). Data
are representative of n = 3 experiments. Scale bar, 50 µm. e, After dome
collapse, gaps (red arrowheads) were apparent at tricellular junctions.
Scale bar, 10 µm.

Extended Data Fig. 3 | Dome volume dynamics during spontaneous
fluctuations. a, c, Time evolution of the dome volume in experiments
that last 12 h (a) and 6 h (c). Cells are MDCK-LifeAct. b, d, Confocal x-z
sections of domes during these experiments. Data representative of n = 10
experiments. Scale bars, 50 µm.

Extended Data Fig. 4 | Tension-strain relations in MDCK-CAAX
and Caco2 cells. a, Relation between surface tension and areal strain for
MDCK-CAAX cells. Data include measurements at different time points
from n = 9 domes. The tension-strain relation is qualitatively similar to
the one obtained for MDCK-LifeAct cells (Fig. 3e), with small quantitative
differences. The solid line and shaded area indicate the mean ± s.d.
obtained by binning the data (n = 14 points per bin). b, Image of a
representative Caco2-cell dome labelled with BODIPY FL C16 dye (n = 3
micropatterned substrates). Confocal x-y, x-z and y-z sections are shown.
Scale bar, 50 µm. c, Relation between surface tension and areal strain for
Caco2 cells. Data include measurements at different time points from
n = 6 domes. Caco2 cells show a tensional plateau throughout the probed
strain range. The solid line and shaded area indicate the mean ± s.d.
obtained by binning the data (n = 10 points per bin).
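
The binning with a fixed number of points per bin can be sketched as follows. This is a minimal illustration, not the MATLAB procedure used for the figures: it assumes that points are sorted by strain, grouped into consecutive bins of equal count, and summarized by the sample standard deviation. The function name and the handling of a trailing incomplete bin are assumptions.

```python
from statistics import mean, stdev


def bin_by_count(strains, tensions, points_per_bin):
    """Group (strain, tension) pairs into consecutive equal-count bins.

    Returns a list of (mean strain, mean tension, s.d. of tension) tuples,
    one per bin. A trailing bin with fewer than two points is dropped
    because a sample standard deviation cannot be computed for it.
    """
    if len(strains) != len(tensions):
        raise ValueError("strains and tensions must have the same length")
    if points_per_bin < 2:
        raise ValueError("points_per_bin must be at least 2")
    pairs = sorted(zip(strains, tensions))  # sort by strain
    bins = []
    for start in range(0, len(pairs), points_per_bin):
        chunk = pairs[start:start + points_per_bin]
        if len(chunk) < 2:
            continue
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        bins.append((mean(xs), mean(ys), stdev(ys)))
    return bins


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    strains = [0.3, 0.1, 0.2, 0.4]
    tensions = [3.0, 1.0, 2.0, 4.0]
    for strain, tension, sd in bin_by_count(strains, tensions, points_per_bin=2):
        print(f"strain={strain:.2f}  tension={tension:.2f} ± {sd:.2f}")
```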

Extended Data Fig. 5 | Dome cells exhibit large strain heterogeneity.
a, Cellular areal strain εc as a function of dome nominal areal strain εd
during dome swelling. Only a subset of cells is represented and most cells
with εc < εd have been omitted for clarity. Coloured lines represent the
cells labelled in b. Dashed line represents the relation εc = εd. The inset
represents the variance of εc within the dome as a function of εd.
b, Maximum intensity projection of an epithelial dome of MDCK-CAAX
cells at four different time points of the swelling event described in a. The
time evolution of coloured cells is depicted in a using the same colour
code. Scale bars, 50 µm. c, d, represent the same as a, b, for a different
dome of MDCK-CAAX cells during slow deflation. e, Coefficient
of variation (CV) (defined as standard deviation divided by mean)
of MDCK-CAAX cells in a 2D adherent cell monolayer, in weakly
inflated domes (20-100% areal strain), and in highly inflated domes
(100-150%). The coefficient of variation is a non-dimensional indicator
of heterogeneity. The coefficient of variation was calculated by measuring
area of 10 cells in n = 7 cell monolayers, n = 7 weakly inflated domes and
n = 7 highly inflated domes. **P = 0.0041 (left), **P = 0.0041 (right),
two-tailed Mann-Whitney test. Data are shown as mean ± s.d. f, g, Mouse
blastocysts (labelled with E-cadherin) exhibiting heterogeneity in cell area
in the trophectoderm, particularly during hatching (g) (n = 4). Scale bars,
25 µm.
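
For concreteness, the coefficient of variation defined in e can be computed as in the short sketch below. Whether the sample or the population standard deviation was used is not stated, so the sample form is an assumption, as are the function name and the example areas.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by mean (dimensionless).

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption, since the caption does not specify it.
    """
    if len(values) < 2:
        raise ValueError("at least two values are required")
    average = mean(values)
    if average == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return stdev(values) / average


if __name__ == "__main__":
    # Hypothetical areas (µm^2) of 10 cells in one dome, for illustration only.
    cell_areas = [210.0, 180.0, 650.0, 240.0, 900.0,
                  200.0, 230.0, 720.0, 190.0, 260.0]
    print(f"CV = {coefficient_of_variation(cell_areas):.2f}")
```

Because the measure is normalized by the mean, it allows heterogeneity to be compared between flat monolayers and inflated domes even though their average cell areas differ.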

Extended Data Fig. 6 | Superstretched cells display a lower density of F-actin at the cortical surface. a-f, Sum of intensity projection of epithelial
domes stained with phalloidin for F-actin. n = 5. Scale bars, 50 µm.

Extended Data Fig. 7 | Inhibition of ARP2/3 does not affect area
heterogeneity in domes of MDCK cells. a, Coefficient of variation of the
cell area in domes of MDCK-CAAX cells, treated with CK666 (100 µM
for 60 min), compared to control domes. The coefficient of variation is a
non-dimensional indicator of heterogeneity. The coefficient of variation
was calculated by measuring area of 10 cells in n = 6 domes treated with
CK666 and in n = 14 control domes. NS, not significant (P = 0.1256).
Two-tailed Mann-Whitney test. Data are shown as mean ± s.d. b, Dome
nominal areal strain in domes of MDCK-CAAX cells, treated with CK666
(100 µM for 60 min, n = 6), compared to control domes (n = 14). NS, not
significant (P = 0.7043). Two-tailed Mann-Whitney test. Data are shown
as mean ± s.d. c, Maximum intensity projections and x-z sections of a
representative control dome (left) and the same dome treated with CK666
100 µM (60 min). Scale bar, 25 µm.


Extended Data Fig. 8 | Local perturbation of the actin cortex using
photoactivatable cytochalasin D increases cell area. a, Time evolution
of the normalized cell area in response to local photoactivation of
cytochalasin D (black line, activation at t = 0 min, n = 5 domes;
see Methods). The blue line shows the time evolution of control cells
(same illumination protocol but no photoactivatable cytochalasin D in
the medium, n = 8 domes). The area was normalized to the first time
point. Solid lines and shaded areas indicate mean ± s.d. At t = 21 min,
normalized cell areas were significantly different (*P = 0.0159, two-tailed
Mann-Whitney test). b, Normalized cell area 21 min after photoactivation
in three experimental conditions: photoactivated cells (black circles, n = 19
cells from 5 domes), cells subjected to the same illumination protocol but
without photoactivatable cytochalasin D in the medium (blue squares,
n = 19 cells from 8 domes) and cells with photoactivatable cytochalasin D
in the medium but without illumination (red triangles, n = 24 cells from
9 domes). Data include the immediate neighbours of the targeted cells
because cytochalasin D quickly diffused after activation. ****P < 0.0001,
NS, not significant (P = 0.4130), two-tailed Mann-Whitney test. Data
are shown as mean ± s.d. c, Representative photoactivation experiments
showing the apex of one dome before (−12 min) and after (6 min and
21 min) photoactivation of the cell marked with a yellow dashed rectangle
(n = 5). Top panels show the fluorescently labelled membrane and bottom
panels show the SiR-actin channel. Note the increase in cell area and
granulation in the SiR-actin channel (white arrowheads), which indicates
disruption of the actin cortex. Scale bar, 15 µm. d, Control experiment in
which one cell at the apex of the dome (yellow dashed line) was subjected
to the illumination protocol of c without photoactivatable cytochalasin
D in the medium (n = 8). Top panels show the fluorescently labelled
membrane and bottom panels show the SiR-actin channel. Scale bar,
15 µm. See also Supplementary Video 9.

Extended Data Fig. 9 | Intermediate filaments reorganize in
superstretched cells. a-f, Immunofluorescence micrographs
(see Methods), represented using maximum intensity projection, of
domes of MDCK keratin-18-GFP (in green) cells stained for F-actin
(phalloidin, red) and nuclei (Hoechst, blue), n = 3. Scale bars, 50 µm.
a, d, Zoomed-in area (marked with a dashed white square in b, e) showing
that the keratin-18 filament network links neighbouring cells and localizes
at cell boundaries (white arrowheads). Scale bars, 10 µm. c, f, Zoomed-in
area (marked with a dashed white square in b, e) showing that keratin-18
filaments are taut (white arrowheads) and have reorganized, with nodes at
the cell centre connecting different cells. Scale bars, 10 µm.

Extended Data Fig. 10 | Intermediate filaments stabilize cell shape
in superstretched cells. a, Representative MDCK keratin-18-GFP
superstretched cell at the apex of a dome before (0 s) and after (90 s) laser
cutting the keratin filament bundle marked in b with a white arrowhead.
The yellow line marks the outline of the cell measured with bright-field
imaging. Scale bar, 10 µm. b, Magnified view of the region labelled in a
with a dotted magenta rectangle. Scale bar, 5 µm. c, Representative MDCK
keratin-18-GFP weakly stretched cell at the apex of a dome before (0 s)
and after (90 s) laser cutting the keratin filament bundle shown in d.
The yellow line marks the outline of the cell measured with bright-field
imaging. Scale bar, 10 µm. d, Magnified view of the region labelled in c
with a dotted magenta rectangle. The same laser cutting protocol and laser
power were used to cut filaments in superstretched and weakly stretched
cells. n = 5. Scale bar, 5 µm. See Fig. 4o, p for quantification and statistics.

Tc toxin activation requires unfolding
and refolding of a β-propeller

Christos Gatsogiannis, Felipe Merino, Daniel Roderer, David Balchin, Evelyn Schubert, Anne Kuhlee,
Manajit Hayer-Hartl & Stefan Raunser


Tc toxins secrete toxic enzymes into host cells using a unique syringe-like injection mechanism. They are composed of 
three subunits, TcA, TcB and TcC. TcA forms the translocation channel and the TcB-TcC heterodimer functions as a cocoon 
that shields the toxic enzyme. Binding of the cocoon to the channel triggers opening of the cocoon and translocation 
of the toxic enzyme into the channel. Here we show in atomic detail how the assembly of the three components activates 
the toxin. We find that part of the cocoon completely unfolds and refolds into an alternative conformation upon 
binding. The presence of the toxic enzyme inside the cocoon is essential for its subnanomolar binding affinity for the TcA 
subunit. The enzyme passes through a narrow negatively charged constriction site inside the cocoon, probably acting as 
an extruder that releases the unfolded protein with its C terminus first into the translocation channel. 


Tc toxins are found in pathogenic bacteria that affect insects and
humans. Tc toxins of insect pathogens are potential biopesticides and
therefore the focus of crop protection research, and understanding
the mechanism of action of Tc toxins of human pathogens is medically
relevant. A Photorhabdus toxin complex (Tc) is typically composed
of three proteins, TcA, TcB and TcC (Fig. 1a). TcB and TcC form a
heterodimeric cocoon of about 300 kDa. The C-terminal hypervariable
region (HVR) of TcC is autoproteolytically cleaved, generating a
toxic enzyme of about 30 kDa. The HVR varies greatly in sequence
among different TcC homologues; the enzymes therefore have diverse
toxic activities. In the case of TccC3, a TcC protein from Photorhabdus
luminescens, the enzyme functions as an ADP-ribosyltransferase, which
post-translationally modifies actin, leading to intracellular actin
aggregation and cell death. The enzyme is not resolved in the crystal
structures of either TccC3 or its Yersinia entomophaga homologue YenC2,
suggesting that it is at least partially unfolded inside the cocoon.

TcA is a 1.4-MDa protein, which forms a translocation channel that is
shielded by a shell. A shift to higher or lower pH opens an electrostatic
lock at the bottom of the shell, triggering its structural rearrangement.
The translocation channel is released and the compaction of a stretched
linker that connects channel and shell drives the membrane insertion
of TcA. The anchoring of TcA on the membrane and the necessary
counterforce for insertion are likely to be provided by binding to one or
more receptors, which remain unidentified. Once inside the membrane,
conformational changes result in the opening of the channel.

Structural studies of the wild-type ABC holotoxin (ABC(WT)) from
the P. luminescens strain W14, comprising TcA (TcdA1), TcB (TcdB2)
and TcC (TccC3), have demonstrated that binding of TcB-TcC to
TcA induces the opening of a gate formed by a distorted six-bladed
β-propeller at the bottom of TcB-TcC. However, the mechanism of
gate control and opening remains unknown. We previously
hypothesized that after gate opening the ADP-ribosyltransferase inside the
TcB-TcC cocoon is translocated into the TcA channel, and ultimately
released into the host cell. However, clear densities corresponding to
the β-propeller and ADP-ribosyltransferase were missing in the
electron cryo-microscopy (cryo-EM) map of the holotoxin, limiting our
understanding of the opening and initial translocation event.


Here we present two near-atomic cryo-EM structures of a complete 
Tc holotoxin complex, which reveal the precise mechanism of Tc toxin 
assembly, gate opening and release of the cytotoxic enzyme into the 
translocation channel. 

Fig. 1 | Cryo-EM structure of the ABC holotoxin. a, Side view of the
3D reconstruction of the ABC holotoxin complex (TcA (coloured by
subunits), TcB (blue) and TcC (purple)). b, c, Side and top views of the
closed (b, RCSB Protein Data Bank code (PDB) 4O9X) and open (c) state
of the β-propeller domain of TcB. Blades 1, 2, 5 and 6 (salmon), blades 3
and 4 (gatekeeper domain, blue and purple), gatekeeper hairpin residues
(residues 514-524, red), the sensor loop (residues 527-536, orange) and
the TcB-binding domain of TcA (green) are highlighted. The TcB-TcC
cocoon is coloured according to a. The β-hairpin 537-546 (hinge hairpin),
next to the TcB sensor loop, opens by 90°.

Fig. 2 | Transition between two states of the TcB β-propeller. a, c, 2D
histogram of the states observed in the trajectories starting from open free
TcB-TcC (a) or closed TcA-bound TcB-TcC (c). Lines indicate the typical
sequence of conformational changes. b, d, Representative structures for


Structure of the ABC holotoxin complex 

To understand how TcA, TcB and TcC assemble, we solved the struc- 
ture of ABC(WT) by cryo-EM and single-particle analysis using 
SPHIRE!! (Fig. 1a, Extended Data Fig. la-f, Supplementary Table 1, 
Supplementary Video 1). 

The TcA-TcB interface is formed by the B-propeller domain of TcB 
(TcA-binding domain) and the funnel-shaped TcB-binding domain 
of TcA (Fig. la, c). Whereas the conformation of TcA does not change 
upon binding (root mean square deviation (r.m.s.d.) 0.780 A relative to 
free TcA), the 3-propeller of TcB differs considerably from its unbound 
counterpart (Fig. 1b, c). The six blades of the 8-propeller are ordered 
and adopt a pseudo-six-fold symmetry. The resulting inner diameter 
matches exactly the diameter of the channel of TcA, thereby forming a 
continuous passage (Fig. 1c). Binding of TcB to a five-fold symmetric 
TcA creates a symmetry mismatch, resulting in different interfaces at 
each B-propeller blade (Extended Data Fig. 2a). Similar symmetry mis- 
matches have been identified in the proteases ClpAP and ClpXP!”"3, 
the 26S-proteasome"4, the BpA-20S-proteasome complex" and the tails 
of many bacteriophages!®. Notably, similar to the proteasome case TcB- 
TcC adopts a tilted orientation, about 30° relative to the axis of symme- 
try of TcA. The tilted binding results in a large interface with high shape 
complementarity (Sc) (Sc =0.658), comparable to that between protein 
antigens and antibodies (Sc =0.66)!”. The extensive interface explains 
the very high affinity between TcA and TcB-TcC (dissociation constant, 
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hairpin Blade 4 Sensor loop 


TcB 


the states highlighted in a and c. Blades 1, 2, 5, 6 are coloured salmon, 
and blades 3 and 4 (gatekeeper domain) in shades of blue. The sensor 
loop, gatekeeper and hinge hairpin are shown in orange, red and yellow, 
respectively. In d, the TcB-binding domain of TcA is shown in grey. 


Kp=1.46 £0.05 x 107! M) (Extended Data Fig. 3a, b), which is about 
tenfold higher than previously estimated!*. 
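
The consequence of docking a pseudo-six-fold propeller onto a five-fold symmetric TcA can be illustrated with a little geometry. The sketch below is an idealized picture only: it assumes perfectly even angular spacing (60° per blade, 72° per protomer) around a common axis and ignores the roughly 30° tilt. The function name and the zero starting offset are illustrative assumptions, not values taken from the structure.

```python
def blade_offsets(n_blades=6, n_protomers=5, offset_deg=0.0):
    """Signed angular offset (degrees) of each blade from its nearest protomer.

    Assumes idealized, evenly spaced blades and protomers around a shared
    axis; the real complex is tilted and only pseudo-symmetric.
    """
    protomer_step = 360.0 / n_protomers
    offsets = []
    for i in range(n_blades):
        blade_angle = (offset_deg + i * 360.0 / n_blades) % 360.0
        # Distance past the previous protomer position ...
        delta = blade_angle % protomer_step
        # ... converted to a signed distance to the nearest protomer.
        if delta > protomer_step / 2:
            delta -= protomer_step
        offsets.append(round(delta, 6))
    return offsets


if __name__ == "__main__":
    for blade, delta in enumerate(blade_offsets(), start=1):
        print(f"blade {blade}: {delta:+.1f} deg from nearest protomer")
```

In this idealized picture every blade sits at a different offset from its nearest protomer, which is one way to see why each blade forms a different interface; with matching symmetries (for example six blades on six protomers) all offsets would be zero.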

In three TcA protomers, the conserved L2422 forms hydrophobic
interactions with the β-propeller blades 3, 4 and 5 (Extended Data
Fig. 2a). In each case, L2422 is positioned within a hydrophobic
groove resembling a plug-and-socket interaction. The L2422E mutation
reduced the affinity for the cocoon more than fiftyfold (Extended
Data Fig. 3a, e, f, g), demonstrating the importance of this residue.

The positively charged residues R485 (blade 3), K534 (blade 4), R554 
and R500 (blade 5) interact electrostatically with negatively charged 
patches on TcA (Extended Data Fig. 2a). However, most of these inter- 
actions are not conserved (Extended Data Fig. 2b, c). For the interfaces 
with blades 1 and 2, we could only identify putative hydrogen-bond 
interactions (Extended Data Fig. 2a). 


Conformational changes in the open β-propeller

Structural alignment of free and bound TcB-TcC reveals that blades
1, 2, 5 and 6 of the β-propeller remain unchanged following binding
(Fig. 1b, c). Conversely, blades 3 and 4 exhibit large conformational
changes. In the unbound state, they are distorted, with a β-hairpin from
blade 4 (residues 514-524) sealing the cocoon’ (Fig. 1b). Because the
rearrangement of blades 3 and 4 results in the opening of the gate,
we call them the gatekeeper domain, and call the β-hairpin of blade 4
the gatekeeper hairpin. A hairpin connecting blades 4 and 5 (residues
537-546) opens by 90° to allow the conformational changes in the
gatekeeper domain (Fig. 1b, c); we therefore call this the hinge hairpin.

Fig. 3 | ADP-ribosyltransferase in ABC(WT) and ABC(D651A).
a, b, Three-dimensional reconstructions of ABC(WT) (a) and
ABC(D651A) (b) with transparent surface. The density corresponding
to the ADP-ribosyltransferase (light grey) is shown at lower threshold.
c, Residues facing the interior of the channel at the constriction site.
d, Cryo-EM map with the atomic model, highlighting the autoprotease
site of ABC(WT) (top) and ABC(D651A) (bottom). In ABC(D651A) only,
density is apparent beyond the cleavage site (indicated by an asterisk); the
first residues of the toxic domain (dashed line) can be traced.

Gate opening is likely to be triggered by the clash of two loops,
residues 2418-2430 in TcA and residues 527-536 in TcB (Extended
Data Fig. 4a); we call these the sensor loops. The sensor loop of TcA
does not change its conformation upon TcB binding. By contrast, the
sensor loop of TcB undergoes a large conformational change; together,
the hinge and gatekeeper hairpins and the TcB sensor loop form the
β-sheet of blade 4 in the open β-propeller conformation (Fig. 1c).
Because this loop connects blade 5 with the gatekeeper hairpin
in blade 4 (Fig. 1b, c, Extended Data Fig. 4a), the initial clash can
destabilize blade 4. To test this, we created several mutants, including
sensor-loop deletions and the point mutations L2422E (TcA) and
F532A or D530A/F532A (TcB). All but the TcA deletions (Extended
Data Fig. 5a) could be expressed and purified. The affinity of all
mutants is decreased by factors of between 3 and 50 compared to the
wild type (Extended Data Fig. 3e, 5b-f). In spite of the lower affinity,
holotoxins of TcA and all four TcB-TcC variants could be formed, with
the point mutations being as toxic as TcB-TcC(WT) (Extended Data
Fig. 5g-i). Deletion of the TcB sensor loop resulted in significantly
decreased cytotoxicity even at a tenfold-higher toxin concentration
(Extended Data Fig. 5g). This suggests that even though the holotoxin
can be formed, the gatekeeper domain does not switch to the
open conformation, owing to the absence of the TcB sensor loop. As a
consequence, translocation of the toxic component from the cocoon
to the TcA channel is blocked.

Further, combining the TcB-TcC mutants with the TcA mutant 
L2422E resulted in a near-complete loss of holotoxin formation 
(Extended Data Fig. 5h, i), which demonstrates the importance of the 
sensor loops for complex formation. 


β-Propeller refolding

To further explore the gate-opening mechanism of TcB, we performed
molecular dynamics simulations of free TcB-TcC in its closed (2.1 µs)
and open (4.2 µs) β-propeller conformations. The simulations show
that the TcB-TcC cocoon is stable in both conformations. However,
the β-propeller, particularly around blade 3, is more dynamic in the
open state (Extended Data Fig. 6a-f), suggesting that TcA stabilizes the
open conformation of the β-propeller.

To further investigate the conformational change of the β-propeller,
we performed molecular dynamics simulations using a structure-based
model that included both states simultaneously (see Methods).
Simulations that start from a closed β-propeller never fully transition
to the open state (Extended Data Fig. 7a, b); on rare occasions
blade 3 briefly unfolds before refolding back to the closed conformation
(Extended Data Fig. 7b). This suggests that blade 3 is the most
unstable region of the β-propeller. Notably, in many simulations that
start from the open state, the gatekeeper domain quickly unfolds and
later slowly adopts the closed conformation (Fig. 2a, b, Extended Data
Fig. 7a, b, Supplementary Video 2). In all those trajectories, blade 3
unfolds first followed by blade 4, with refolding occurring in the reverse
order (Fig. 2a, b).

In addition, we performed structure-based model simulations of
the holotoxin to explore the effects of TcA on the β-propeller.
Simulations that start from the open conformation maintain the same
state (Extended Data Fig. 7c, d), demonstrating that TcA stabilizes
the open β-propeller. However, simulations that start from the closed
state never fully transition to the open conformation. Instead, the sensor-loop
clash results in partial unfolding of the hinge hairpin (Extended Data
Fig. 7c, d, g), consistent with our interpretation that this is an early
step in the conformational change.

We then performed simulations with a mildly destabilized β-propeller
(see Methods). Again, simulations started from the open state
preserve their conformation (Extended Data Fig. 7e, f). By contrast,
blade 3 quickly unfolds in the simulations started from the closed
state, with the sensor-loop clash resulting in the unfolding of blade 4
and subsequently the whole gatekeeper domain (Fig. 2c, d). Refolding
occurs in a sequential manner, with blade 4 followed by blade 3 folding
into the open state (Fig. 2c, d, Supplementary Video 3). Therefore, the
transition occurs in the reverse order to that seen in the simulations of
the free TcB-TcC (Fig. 2a, b). As we never observe a transition without
unfolding, we propose that the β-propeller locally unfolds and refolds
to switch between states. A similar refolding event switches the function
of RfaH between transcription and translation factor¹⁹; however,
the RfaH transition occurs through an intermediate that lies between
both states²⁰.

In the simulations, blades 3 and 4 could either be unfolded or have the
same state as the other blades. Thus, whereas unfolding of blade
3 always initiates the transition, it is the relative stability of blade 4 that
determines the final state. In the holotoxin, the destabilization
of blade 4 by the sensor loop therefore pushes the protein towards the open
conformation. A similar scenario has been proposed for haemagglutinin,
in which destabilization of a small region has a key role in the
conformational change required for membrane fusion²¹.


Initiation of translocation 

We identified additional density corresponding to the ADP-ribosyltransferase
inside the cocoon (Fig. 3a). The density appears at a
lower threshold than the rest of the map, with the lack of recognizable
secondary structure indicating that the ADP-ribosyltransferase is
flexible. The density fills almost the entire cocoon, continuing through
the TcA translocation channel up to L2085 (Fig. 3a).

To test whether autoproteolysis of TcC is required for the ADP-ribosyltransferase
to enter the translocation channel, we solved the
structure of a holotoxin with a proteolytically inactive TcC (TcA-TcB-TcC(D651A);
referred to collectively as ABC(D651A))° (Extended
Data Fig. 1g-l, Supplementary Table 1, Supplementary Video 4). In
this case, the density also enters the TcA translocation channel, but
only reaches down to P2020 (Fig. 3b). Consequently, a larger part of
the ADP-ribosyltransferase is found in the cocoon (Supplementary
Video 5). In contrast to ABC(WT), TcC(D651A) density continues after
the cleavage site (Fig. 3d), indicating that the ADP-ribosyltransferase is
indeed uncleaved. Because the ADP-ribosyltransferase is attached at its
N terminus, the protein must have moved with its C terminus first. The
similarity of the densities of the cleaved and uncleaved enzymes (Fig. 3,
Supplementary Video 5) further suggests that the enzyme undergoes
the same C-to-N-terminal translocation in ABC(WT). A similar mechanism
has previously been proposed for diphtheria toxin”. By contrast,
the anthrax lethal factor is translocated in an N-to-C-terminal direction’.
Similarly, protein translocation into the endoplasmic reticulum
and other cell compartments is typically in an N-to-C-terminal direction.
However, when the signal sequence of proteins targeted to mitochondria
is placed at the C terminus instead of the N terminus, these
proteins are also transported in the C-to-N-terminal direction”. Our study
shows that toxins can probably also be translocated in this non-conventional
‘backwards’ fashion.


The TcB constriction site 

After opening, the inner diameter of the β-propeller gate measures
11-15 Å. The narrowest passage in the entire ABC complex, the 10.5 Å
constriction site, lies directly above the β-propeller (Extended Data
Fig. 4b, c). This constriction site is composed of a ring of polar, mostly
negatively charged residues (Fig. 3c), with a highly conserved aspartate
(D34) positioned at the entrance (Extended Data Fig. 8a). The ring forms
a band of negative electrostatic potential, similar to those in the lumen
of the TcA channel’®. Whereas there is little sequence conservation of
the other residues at the constriction, the arrangement of charges is conserved
among different Tc homologues (Extended Data Fig. 8).

The small diameter of the constriction site should prevent passage
of any tertiary structures; this is supported by the shape of the ADP-ribosyltransferase
density (Fig. 3a, b). We used molecular dynamics
simulations to assess whether an α-helix can nevertheless traverse
the constriction. The simulations show that a full helix can be accommodated
with its secondary structure mostly preserved; hydrophobic
residues of the helix nestle against hydrophobic patches in TcB, resulting
in a peptide with partial hydration (Extended Data Fig. 9).


ADP-ribosyltransferase required for complex assembly 
Binding experiments show that the values of K_D and association rate
constant (k_on) for TcB-TcC(D651A) binding to TcA are comparable
to those of the wild-type complex, indicating that the impaired cleavage
of TcC does not influence holotoxin formation (Extended Data
Fig. 3b, c). However, the affinity of a TcB-TcC complex without ADP-ribosyltransferase
(empty TcB-TcC) to TcA is much lower; the K_D
is three orders of magnitude higher, owing to a decrease of approximately
1,000-fold in k_on (Extended Data Fig. 3d). The values of the
dissociation rate constant (k_off), by contrast, are similar for the empty and loaded
TcB-TcC cocoons, demonstrating that the reduced affinity is not a
result of missing interactions between the ADP-ribosyltransferase and
TcA. This drop in affinity could be a mechanism to ensure that only
fully functional TcB-TcC complexes are loaded onto TcA.
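
The link between the rate constants and the affinity follows from the standard relation for a 1:1 interaction, K_D = k_off / k_on. The short sketch below shows why an unchanged k_off combined with a roughly 1,000-fold lower k_on yields a K_D about three orders of magnitude higher. The numerical rate constants are hypothetical placeholders chosen for illustration; they are not the measured values.

```python
def dissociation_constant(k_on, k_off):
    """K_D (M) of a 1:1 interaction from k_on (M^-1 s^-1) and k_off (s^-1)."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


# Hypothetical rate constants, for illustration only (not measured values).
K_ON_LOADED = 1.0e6  # M^-1 s^-1, cocoon containing the ADP-ribosyltransferase
K_OFF = 1.0e-4       # s^-1, assumed identical for loaded and empty cocoons

kd_loaded = dissociation_constant(K_ON_LOADED, K_OFF)
# Empty cocoon: same k_off, but k_on reduced 1,000-fold.
kd_empty = dissociation_constant(K_ON_LOADED / 1000.0, K_OFF)
fold_change = kd_empty / kd_loaded

if __name__ == "__main__":
    print(f"K_D loaded: {kd_loaded:.2e} M")
    print(f"K_D empty:  {kd_empty:.2e} M")
    print(f"fold change in K_D: {fold_change:.0f}")
```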

To better understand this effect, we determined the crystal structure
of empty TcB-TcC at a resolution of 3.2 Å. Unexpectedly, the
TcA-binding region was very similar to that of the wild-type complex
(Extended Data Fig. 4b, d, e). However, this conformation may be artificially
stabilized, as the TcA-binding domain mediates some of the crystal contacts
(Extended Data Fig. 4d, e).

To further probe the effect of the ADP-ribosyltransferase on TcB-TcC,
we monitored the flexibility of empty TcB-TcC, TcB-TcC(WT),
TcB-TcC(D651A), ABC(WT) and ABC(D651A) using hydrogen-deuterium
exchange with mass spectrometry (HDX-MS) (see Methods
and Fig. 4a). Whereas the sequence coverage for the holotoxins was
only about 50% (Source Data for Fig. 4), a part of the gatekeeper hairpin
(residues 514-524) (Fig. 4b) could be identified in all samples. We
observed only low levels of deuterium incorporation for the empty TcB-TcC,
ABC(WT) and ABC(D651A) (Fig. 4b, Extended Data Fig. 10a). By
contrast, the TcB-TcC(D651A) complex showed substantial deuterium
incorporation, indicative of structural destabilization, whereas TcB-TcC(WT)
showed an intermediate level of deuterium incorporation
(Fig. 4b, Extended Data Fig. 10a). A similar effect was observed for a
part of blade 4 (residues 463-471) of the β-propeller (Extended Data
Fig. 10b, c). For both peptides, destabilization was more pronounced
for TcB-TcC(D651A) (Fig. 4b, Extended Data Fig. 10c).

Our results show that TcA selectively stabilizes the gatekeeper hairpin
and nearby TcB β-propeller elements, consistent with the structure of
the complex (Fig. 1b, c, Extended Data Fig. 4a). Moreover, in contrast
to our crystallographic data, the HDX-MS data show that the ADP-ribosyltransferase
destabilizes the gatekeeper domain of free TcB-TcC
(Fig. 4b, Extended Data Fig. 10a). We therefore propose that the ADP-ribosyltransferase
applies steric ‘pressure’ on the gatekeeper domain,
facilitating TcA binding without opening the gate in the absence of TcA.


Conclusion 

The results of the present study enable us to describe the mechanism of
Tc toxin activation in detail. We have directly demonstrated the presence
of a low-resolution density for the ADP-ribosyltransferase inside
the cocoon. Therefore, as in our previous model®, we propose that the
ADP-ribosyltransferase is partially unfolded (Fig. 5, left). We previously
showed that the bottom of the cocoon is closed by a β-hairpin
(gatekeeper hairpin) that blocks the central opening of the β-propeller
(Fig. 5, left). Here we demonstrate that, without opening the cocoon,
the ADP-ribosyltransferase influences the gatekeeper hairpin, probably
by increasing the pressure within the cocoon, resulting in subnanomolar
affinity of TcB-TcC for TcA.

The assembly-competent cocoon docks with its distorted β-propeller
onto TcA (Fig. 5, middle), where the sensor loops of the two
proteins clash. This destabilizes the gatekeeper domain and triggers a
large conformational change in blades 3 and 4 of the β-propeller; they
completely unfold and refold to form a pseudo-symmetrical six-bladed
β-propeller. During this process the gatekeeper hairpin is rearranged
and becomes part of the β-sheet of blade 4. The hinge hairpin, which
is directly connected to the sensor loop, opens by 90°, allowing these
large changes (Fig. 5, middle). Finally, this results in the opening of the
gate, forming a continuous translocation channel with TcA. Despite
the symmetry mismatch between TcA and TcB, the respective surfaces
of the resulting interface are a close match. During translocation, the
unfolded ADP-ribosyltransferase passes through a constriction site
through which no tertiary structure can pass. Unusually, the protein
begins translocating with its C terminus first. In the final state, the N
terminus of the unfolded ADP-ribosyltransferase still resides within the
TcB-TcC cocoon, whereas the C terminus extends to the centre of the
TcA channel (Fig. 5, right). Threading into the translocation channel
occurs spontaneously after gate opening. Further translocation is likely
to require injection of the TcA channel into the membrane of the target
cell and opening of the initially closed pore’”.


METHODS
Protein production. TcdA1 was expressed in BL21-CodonPlus(DE3)-RIPL in 10 l
LB medium and purified as previously described”.

Fusion proteins TcdB2-TccC3 and cleavage-deficient TcdB2-TccC3(D651A)°
were expressed as previously described!” with the following modifications. Expression
was performed in BL21-CodonPlus(DE3)-RIPL cells in 10 l LB medium starting from
a single transformant with 30 µM IPTG added immediately at the start of expression.
Cells were grown for 4 h at 28°C, followed by 20 h at 25°C and 24 h at 20°C. Cell lysis
and protein purification were performed as previously described’.

The sequence of TcdB2-TccC3 without the hypervariable region (HVR) (empty TcB-TcC)
was amplified from the TcdB2-TccC3 sequence using the primer pair
GGATCCATGCAAAATTCACAAGATTTTAGTATTACGGAACTGTCAC and
CTCGAGTTATAATCCATCAGGATCAAGGAGGGTAACTGG and cloned in
pET28a via BamHI and XhoI, resulting in a TcdB2-TccC3 fusion construct
terminated after L678 of TccC3. Empty TcdB2-TccC3 and all sensor-loop variants
were expressed and purified as for the wild-type complex.
Cryo-EM data acquisition. ABC(WT) and ABC(D651A) were applied to 
glow-discharged holey carbon grids (C-Flats-2/1, Protochips) and blotted inside 
a Cryoplunge3 (Cp3, Gatan) using 3-s blotting time at 94% humidity and plunge- 
frozen in liquid ethane. Grids were stored in liquid nitrogen. 

A dataset of ABC(D651A) was collected at the National Center for Electron 
Nanoscopy in Leiden (NeCEN) with a Cs-corrected FEI Titan-Krios equipped with 
a XFEG and operated at an acceleration voltage of 300 kV. Images were collected 
automatically using EPU (FEI). For every selected grid square, 4 positions within 
each hole were imaged. Images were collected at a magnification of 125,000 (nominal
magnification 59,000×) using a back-thinned Falcon-II (FEI) direct electron
detector, corresponding to a pixel size of 1.1 Å/pixel on the specimen level.

Starting at 85 ms, seven frames (55 ms/frame) with a total dose of 15.4 e⁻ Å⁻²
and one integrated image (motion uncorrected) with a total dose of ~35 e⁻ Å⁻²
were acquired and used for further image processing. A total of 4,958 images were
collected in a defocus range of 0.8 to 2.8 µm.

A dataset of wild-type ABC was collected at the Max Planck Institute of
Molecular Physiology, Dortmund using the same hardware setup (Cs-corrected
Titan Krios equipped with a XFEG and a Falcon II). Images were recorded using
the automated acquisition program EPU (FEI) at a magnification of 122,870,
corresponding to a pixel size of 1.14 Å/pixel on the specimen level. Movie-mode images
(3,068) were acquired in a defocus range of 0.8 to 2.5 µm. Each movie comprised
24 frames acquired over 1 s with a total cumulative dose of ~60 e⁻ Å⁻².
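
The pixel sizes quoted for the two datasets can be cross-checked against the stated magnifications with simple arithmetic. The sketch below assumes a physical detector pixel of 14 µm for the Falcon II; that value is an assumption that is not given in the text, and the function name is illustrative.

```python
DETECTOR_PIXEL_UM = 14.0  # assumed physical pixel size; not stated in the text


def pixel_size_angstrom(magnification, detector_pixel_um=DETECTOR_PIXEL_UM):
    """Specimen-level pixel size in angstroms per pixel."""
    if magnification <= 0:
        raise ValueError("magnification must be positive")
    return detector_pixel_um * 1.0e4 / magnification  # 1 um = 10^4 angstrom


necen_pixel = pixel_size_angstrom(125_000)     # ABC(D651A) dataset
dortmund_pixel = pixel_size_angstrom(122_870)  # wild-type dataset

if __name__ == "__main__":
    print(f"125,000x -> {necen_pixel:.2f} A/pixel")
    print(f"122,870x -> {dortmund_pixel:.2f} A/pixel")
```

Under this assumption the two magnifications give about 1.12 and 1.14 Å/pixel, in line with the reported 1.1 and 1.14 Å/pixel.
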
Image processing. After initial screening of all ABC(D651A) micrographs, 4,729
integrated images were selected for further processing. Single particles (137,733)
were manually picked with e2boxer”’. The integrated images were also used to
determine the contrast transfer function (CTF) parameters using CTER”, implemented
in the SPHIRE software package¹¹. Outlier images were removed using
the graphical CTF assessment tool in SPHIRE¹¹. Reference-free 2D classification
and cleaning of the dataset was performed with the iterative stable alignment and
clustering approach ISAC”’ in SPHIRE. ISAC was performed with a pixel size of
6.875 Å/pixel on the particle level. The ‘beautify’ tool of SPHIRE was then applied
to obtain refined and sharpened 2D class averages at the original pixel size, showing
high-resolution features (Extended Data Fig. 1h). Our previous 9.1 Å 3D reconstruction
of the ABC holotoxin (Electron Microscopy Data Bank code: EMD-2551)
was used as initial reference for 3D refinement, after proper scaling and filtering to
25 Å. Three-dimensional refinement without imposing symmetry was performed
in SPHIRE using MERIDIEN, which is a maximum likelihood-based 3D-structure
refinement program, driven by the gold-standard Fourier shell correlation (FSC).
After each refinement cycle, we took advantage of the ‘user-function’ option of
MERIDIEN and, with the help of a short Python script, automatically symmetrized
the outer shell of the TcA component as previously described®”*. In brief, after each
refinement cycle, the density of TcA was masked out and symmetrized with C5
symmetry. The remaining density of TcB-TcC and the background were scaled
to match the threshold of the density of TcA. Subsequently, the two densities were
merged, automatically masked using an adaptive mask to remove background
noise, and the resulting density was then forwarded and used as a reference for
the subsequent refinement cycle. This procedure was performed to obtain global
parameters. The user function was not applied and TcA was not symmetrized
during the final local refinement rounds. Three-dimensional variability analysis,
based on the final projection parameters using SPHIRE, revealed that the variance
is mostly restricted within the TcB-TcC cocoon interior, thus further suggesting
that the HVR is unfolded or flexible within the cocoon. Three-dimensional
classification into five groups was performed using the SORT3D tool of SPHIRE
with a 3D focused binary mask including only the cocoon interior. However, most
probably owing to the intrinsic flexibility of the HVR, we were not able to improve
the resolution for this structural region. Therefore, we continued with the complete
dataset after ISAC, including 132,033 single particles, as the respective density map
showed the highest resolution at the interface of TcA-TcB with TcC.


The seven movie frames were then aligned and averaged using MotionCorr”.
The 132,033 particles, obtained by processing of the integrated images, were then
re-extracted from the low-dose and motion-corrected averages and
subjected to a few rounds of local refinement, using the ‘continue’ mode
in MERIDIEN. The estimated accuracy of angles and shifts at the final iteration
was 0.72 degrees and 0.9 pixels, respectively. The resolution of the
final density was estimated by using the FSC (adjusted for the full size of the
dataset) and a soft Gaussian mask, giving an average resolution of 4.22 or
3.72 Å, according to the FSC 0.5 or 0.143 criterion, respectively (Extended Data Fig. 1i),
and a B-factor of −49.43 Å². The two half volumes were then merged and the
resulting volume was masked and sharpened accordingly. Local FSC calculation
was performed using the ‘local resolution’ tool in SPHIRE. This analysis showed
that the core of the TcA complex was resolved to 3.2 Å resolution (at FSC 0.143),
whereas the upper part of the cocoon (corresponding to TcC) showed the lowest
resolution of ~4.5-5 Å (at FSC 0.143). The rest of the density was resolved to the
average resolution (Extended Data Fig. 1k). The reported values were consistent
with the observed structural details. The density was filtered according to its local
resolution using the 3D Local Filter tool in SPHIRE.
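
How a single FSC curve yields two different resolution figures can be sketched as follows. This is a minimal illustration, not the SPHIRE implementation: it assumes a monotonically sampled curve, uses linear interpolation between shells, and runs on a synthetic, made-up FSC curve. The function name is hypothetical.

```python
def resolution_at_threshold(spatial_freq, fsc, threshold):
    """Resolution (angstrom) where the FSC curve first drops below `threshold`.

    spatial_freq: increasing spatial frequencies in 1/angstrom.
    fsc: FSC value at each frequency.
    Linear interpolation is used between the two bracketing shells.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            freq = spatial_freq[i - 1] + frac * (spatial_freq[i] - spatial_freq[i - 1])
            return 1.0 / freq
    return None


# Synthetic curve for illustration only (not data from this study).
freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
curve = [1.00, 0.95, 0.80, 0.60, 0.30, 0.05]

res_05 = resolution_at_threshold(freqs, curve, 0.5)
res_0143 = resolution_at_threshold(freqs, curve, 0.143)

if __name__ == "__main__":
    print(f"FSC 0.5:   {res_05:.2f} A")
    print(f"FSC 0.143: {res_0143:.2f} A")
```

Because the curve crosses 0.5 at a lower spatial frequency than 0.143, the 0.5 criterion always reports the numerically larger (more conservative) resolution, as in the paired values quoted above.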

The frames of the movie-mode images of the ABC holotoxin dataset were
motion-corrected, weighted and averaged using Unblur and Summovie*’. In
addition, we created averages of the first 8 motion-corrected frames, without dose
weighting (total dose 20 e⁻ Å⁻²). CTF estimation was performed with CTER on
full-dose unweighted motion-corrected sums. Outlier images were removed using
the CTF assessment tool in SPHIRE. Particles were then manually picked from
100 selected micrographs of large defocus (full-dose weighted sums), and the extracted
particles were subjected to 2D classification using ISAC. Selected class averages
were then used as templates to pick the complete dataset using Gautomatch.
In total, 205,336 particles were picked, extracted and subjected to 2D classification using ISAC
(Extended Data Fig. 1b). From the initial set of particles, the clean set used for 3D
refinement contained 89,148 particles. For the refinement of this dataset we used
the same multi-symmetry procedure as described above for the ABC(D651A)
dataset. At the end of the refinement, we replaced the particles extracted from the
full dose-weighted sums with particles extracted from the unweighted 20 e⁻ Å⁻²
motion-corrected sums and subsequently subjected this dataset to a few rounds of
local refinement, using the ‘continue’ mode in MERIDIEN. This improved the
reconstruction significantly. The resolution of the final density was estimated after
applying a soft Gaussian mask, giving an average resolution of 4.45 or 3.94 Å
according to FSC 0.5 or 0.143, respectively (Extended Data Fig. 1c). The B-factor
was estimated to be −68.03 Å². The estimated accuracy of angles and shifts at the
final iteration was 0.69 degrees and 0.8 pixels, respectively. The local resolution of
the density was calculated using the ‘local resolution’ tool in SPHIRE (Extended
Data Fig. 1e), and finally the density was filtered accordingly using the ‘3D local
filter’ tool. All steps of image processing above were performed using SPHIRE
unless stated otherwise¹¹. Details related to data processing are summarized in
Supplementary Table 1.
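
The 20 e⁻ Å⁻² quoted for the averages of the first 8 frames is consistent with the ~60 e⁻ Å⁻² accumulated over 24 frames, provided the dose is spread evenly over the frames. That even distribution is an assumption of the sketch below, and the function name is illustrative.

```python
def dose_in_first_frames(total_dose, n_frames, n_used):
    """Cumulative dose (e-/A^2) in the first `n_used` frames of a movie.

    Assumes the total dose is distributed evenly over all frames.
    """
    if not 0 <= n_used <= n_frames:
        raise ValueError("n_used must lie between 0 and n_frames")
    return total_dose * n_used / n_frames


# Wild-type dataset: ~60 e-/A^2 over 24 frames, first 8 frames averaged.
wild_type_subset = dose_in_first_frames(60.0, 24, 8)
# ABC(D651A) dataset: 15.4 e-/A^2 over seven frames, dose per single frame.
d651a_per_frame = dose_in_first_frames(15.4, 7, 1)

if __name__ == "__main__":
    print(f"first 8 of 24 frames: {wild_type_subset:.1f} e-/A^2")
    print(f"per frame, D651A:     {d651a_per_frame:.1f} e-/A^2")
```
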
Bioinformatics tools. The geometry of the final refined model was evaluated with
MolProbity**; data statistics are summarized in Supplementary Table 1. Shape complementarity
was calculated with Sc¹⁷, included in the ccp4-software package*’.
Analysis of the channel size was performed with ChExVis™*. The interface of TcA-TcB
with TcC was analysed using PDBePISA*.

Homologues of TcdB2 were identified using protein BLAST** and sequences 
of 14 homologues were aligned in Clustal Omega*”. Sequence conservation in 
the TcA-binding region and in the acidic constriction site was analysed using the 
ConSurf server*® and visualized in UCSF Chimera. Homology models of TcB 
sequences were created in UCSF Chimera”? using the Modeller plugin”. Analysis 
of electrostatic potentials of the models in the clamp region (15 Å radius) and
clustering of the models was performed in PISA“". 

For visualization, analysis and preparation of figures and movies, we used 
Chimera*’, PyMOL (http://www.pymol.org/) and VMD®. 
Hydrogen-deuterium exchange-mass spectrometry. All proteins were prepared
at an initial concentration of 2 µM in exchange buffer (40 mM HEPES pH 8.0,
200 mM NaCl, 1 mM TCEP and 10% glycerol). Hydrogen-deuterium exchange
was initiated by adding 5 µl protein to 45 µl deuteration buffer (exchange buffer
prepared in D₂O). Samples were incubated at 25°C for different times (10-3000 s)
before quenching by addition of 50 µl ice-cold quench buffer (100 mM sodium
phosphate pH 2.2, 5 mM TCEP, 2 M GuHCl) to a final pH of 2.6.
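
The labelling conditions implied by these volumes can be worked out directly. The sketch below assumes that the deuteration buffer is essentially pure D₂O and that the protein stock contributes only H₂O; both are simplifying assumptions, and the helper name is illustrative.

```python
def dilute(concentration, sample_volume, added_volume):
    """Concentration after adding `added_volume` to `sample_volume` (same units)."""
    return concentration * sample_volume / (sample_volume + added_volume)


# 5 ul of 2 uM protein added to 45 ul deuteration buffer.
protein_during_labelling_uM = dilute(2.0, 5.0, 45.0)
# Fraction of D2O during labelling, assuming the deuteration buffer is pure D2O.
d2o_fraction = 45.0 / (5.0 + 45.0)
# Quenching with 50 ul quench buffer halves the concentration again.
protein_after_quench_uM = dilute(protein_during_labelling_uM, 50.0, 50.0)

if __name__ == "__main__":
    print(f"protein during labelling: {protein_during_labelling_uM:.2f} uM")
    print(f"D2O fraction:             {d2o_fraction:.0%}")
    print(f"protein after quench:     {protein_after_quench_uM:.2f} uM")
```

Under these assumptions, labelling takes place at 0.2 µM protein in about 90% D₂O, and the quenched sample contains 0.1 µM protein.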

Quenched samples were immediately injected into a Waters ACQUITY UPLC
M-class with hydrogen-deuterium exchange via a 50-µl sample loop. Digestion
was performed using an Enzymate BEH-pepsin column (Waters) at a flow rate
of 100 µl min⁻¹ and temperature of 20°C. Peptides were trapped and desalted
for 3 min at 100 µl min⁻¹ before transfer to a 1.0 × 100 mm ACQUITY UPLC
peptide CSH C18 column (Waters) held at 0°C. Peptides were eluted over 7 min at
40 µl min⁻¹ with an 8-40% acetonitrile gradient in 0.1% formic acid pH 2.5. Mass
analysis was performed on a Waters Synapt G2Si. T-wave ion mobility was used as
an orthogonal peptide separation step between the UPLC and mass spectrometer“,
and ion guide settings were adjusted to minimize gas-phase back exchange as previously
described. Peptides were identified by analysing MS^E data (data-independent
acquisition of product ion spectra) for 4-5 undeuterated control experiments
using PLGS 3.01 (Waters). Mass spectra of deuterated peptides were processed in
DynamX 3.0 (Waters) and peak selection was manually verified for all peptides.
All experiments were performed under identical conditions. Deuterium levels
were therefore not corrected for back exchange and are reported as relative*’. All
experiments were performed in duplicate.

Crystallization of empty TcB-TcC, data collection and processing. The empty
TcB-TcC complex first crystallized as thin sheets using the sitting-drop vapour
diffusion method at 20°C by mixing 1 µl TcB-TcC with 1 µl reservoir solution containing
0.1 M tri-sodium citrate pH 5.5, 10% PEG 8000 and 10% ethylene glycol.
The thin sheets were used to prepare a seed solution. Final crystals were obtained
within 5-7 days by mixing 1 µl TcB-TcC with 1.5 µl reservoir solution and 0.5 µl
seed solution. Prior to flash-freezing in liquid nitrogen, the crystals were soaked
in reservoir solution containing 20% glycerol as a cryoprotectant.

Diffraction data were collected at the PXII-X10SA beamline at the Swiss Light
Source and were processed with the XDS package. Phases were determined
by molecular replacement with PHASER“ using the crystal structure of wild-type
TcB-TcC (PDB 4O9X°) as a search model. Empty TcB-TcC crystallized
in primitive hexagonal space group P3221 with unit cell dimensions of
232 × 232 × 142 Å and one molecule per asymmetric unit. The structures were
optimized by iteration of manual and automatic refinement using COOT“”’ and
phenix.refine implemented in the PHENIX package to a final R_free of 25%. Details
related to data processing are summarized in Supplementary Table 2.

Affinity determination using biolayer interferometry. Affinities of
TcB-TcC(WT) and of all TcB-TcC variants to TcA were determined by biolayer
interferometry (BLI) using an OctetRed 384 (forteBio, Pall Life Sciences) and
streptavidin biosensors.

The different TcB-TcC complexes were biotinylated in 20 mM Hepes-NaOH pH
7.3, 200 mM NaCl, 0.05% Tween20 (labelling buffer) with Sulfo-NHS-LC-Biotin
(Thermo Scientific) in a 1:3 molar ratio for 2 h at room temperature, followed by
16 h at 4 °C. Unreacted biotin label was washed out using Amicon Ultra 100-kDa
cut-off concentrators by diluting the sample two times with a tenfold volume of
buffer and re-concentrating back to the original volume.

Biotinylated TcB-TcC was immobilized on streptavidin biosensors at a
concentration of 10 µg/ml, followed by quenching with 5 µg/ml biotin. BLI sensorgrams
were measured in three steps: baseline (300 s), association (20 s for TcB-TcC(WT)
and TcB-TcC(D651A) and 600 s for empty TcB-TcC, respectively), and dissociation
(200 s for TcB-TcC(WT) and TcB-TcC(D651A) and 600 s for empty TcB-TcC,
respectively). The sensorgrams were corrected for background association of TcA
on unloaded streptavidin biosensors. On- and off-rates of TcA binding were
determined simultaneously by a global curve fit according to a 1:1 binding model. All
BLI steps were performed in labelling buffer with 0.3 mg/ml BSA.
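
As an illustration of what a 1:1 binding model implies for the sensorgrams, the following minimal Python sketch evaluates the standard single-site association and dissociation curves. The function names and the numerical rate constants are hypothetical placeholders chosen for illustration; they are not values from this study, and the sketch does not reproduce the instrument software used for the global fit.

```python
import math


def association(t, conc, k_on, k_off, r_max):
    """Sensor response at time t (s) during association, 1:1 binding model.

    conc is the analyte concentration (M), k_on in M^-1 s^-1, k_off in s^-1.
    """
    k_obs = k_on * conc + k_off                 # observed rate constant
    k_d = k_off / k_on                          # equilibrium dissociation constant
    r_eq = r_max * conc / (conc + k_d)          # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r_start, k_off):
    """Sensor response at time t (s) after the start of dissociation."""
    return r_start * math.exp(-k_off * t)


# Hypothetical rate constants, for illustration only.
k_on = 1.0e5     # M^-1 s^-1
k_off = 1.0e-3   # s^-1
k_d = k_off / k_on

# In a global fit, a single (k_on, k_off) pair must describe all concentrations.
curves = {
    conc: [association(t, conc, k_on, k_off, r_max=1.0) for t in range(0, 201, 50)]
    for conc in (1.25e-9, 1.0e-8, 8.0e-8)
}
```

Because k_on and k_off are shared across all analyte concentrations, the affinity follows directly as K_D = k_off / k_on once the global fit has converged.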

Intoxication assay. HEK293T cells (Thermo Fisher) were intoxicated with pre- 
formed holotoxin formed by TcA(WT) and TcB-TcC(WT) and sensor-loop 
variants. Cells (2 x 10*) were grown adherently in 400 µl DMEM/F12 medium
(Pan Biotech) overnight and, subsequently, 0.5, 2 or 5 nM of holotoxin was added. 
Incubation was allowed to continue for 16 h at 37°C before imaging. Experiments 
were performed in triplicate. Cells were not tested for Mycoplasma contamination. 

Atomic modelling. We modelled the atomic structure of the ABC holotoxin
complexes using an iterative combination of Rosetta de novo building, iterative
refinement, relaxation, molecular dynamics flexible fitting (MDFF) and manual
building in Coot. We performed all MDFF runs in VMD and NAMD,
using the CHARMM 36m force field and an implicit solvation model. We started
the modelling with the ABC(WT) complex. We fitted TcC into the density using
MDFF, given the comparatively low local resolution at the top of the protein. For
TcA, we started from the previously determined structure (PDB 1VW1), and
built an initial model using amino acids 21-2325 by iteratively relaxing the model
into the density with Rosetta and MDFF. At this stage, we imposed C5 symmetry
on the model, given that the symmetry is largely preserved in this region. For
TcB, we split the molecule into several independent regions. In all cases, we started
from the previously determined structure of the complex (PDB 4O9X). Amino
acids 1258-1319 were built from a homology model based on the YenB structure
(see below), and iteratively refined in Rosetta. Most of the β-propeller region of
TcB (amino acids 370-700) was built de novo owing to the large conformational
change. We started by relaxing the β-propeller into the density using Rosetta. Then,
we removed all amino acids belonging to blades 3 and 4 (amino acids 442-553)
and ran several consecutive rounds of Rosetta de novo building. We completed
the β-propeller model using RosettaCM and refined it using iterative rounds
of Rosetta iterative local refinement and MDFF. Finally, we reassembled full TcB
using iterative rounds of Rosetta relaxation and MDFF. We then combined TcB
with amino acids 2326-2516 of TcA, relaxed the complex in Rosetta and further
added TcC into the model. We then relaxed this partial ABC holotoxin complex in
Rosetta, and used it as input for all molecular dynamics simulations (see below).
Finally, we assembled the full ABC holotoxin complex and ran several iterative
rounds of relaxation in Rosetta and MDFF. For MDFF at this step, we imposed
symmetry locally in amino acids 21-2325 of TcA. The final relaxation rounds
were performed without imposing any symmetry. We used the ABC(WT) complex
as input for modelling ABC(D651A), and refined the model using iterative rounds
of Rosetta relaxation and MDFF.

Finally, amino acids without density support were removed from the final
models. In all cases, the quality of the models was judged according to their
Molprobity and EMRinger scores, as well as the integrated FSC values reported
by Rosetta.

Classical molecular dynamics simulations. To test the stability of the
different conformations of TcB, we performed simulations of the TcB-TcC complex
in its open and closed conformations. For the open state, we started the
simulations directly from the partial ABC holotoxin models described above. Because
the structure of the closed conformation is missing some amino acids, we used
MODELLER to complete the model of the protein based on the already known
crystal structure (PDB 4O9X) and the structure of the homologue YenBC (PDB
4IGL). We used Rosetta to relax this model back into the electron density, to correct
any possible errors introduced during the modelling steps.

We assigned the protonation state of ionizable residues at neutral pH, according
to the pKa calculated using the Rosetta pH protocol. We solvated these models
in a dodecahedral box of TIP3P water extending at least 1.5 nm away from every
protein atom. We then neutralized the system with NaCl at an ionic strength of
~150 mM, resulting in typical system sizes of ~440,000 atoms. The systems were
equilibrated using a step-wise protocol. After minimization, we restrained all the
protein heavy atoms and slowly heated the system to 300 K (tau = 1 ps) during
0.1 ns at constant volume. We then maintained the temperature at 300 K and
adjusted the pressure to 1 atm using a Berendsen barostat (tau = 2 ps) during 0.9 ns.
We then switched to the Parrinello-Rahman barostat and simulated for another
1 ns. After this, we removed the positional restraints on the protein side chains and
simulated for 5 ns. Finally, we removed all restraints on the protein and simulated
for an additional 5 ns. From here, we performed 1.05 µs of production run using
the same settings. All simulations correspond to Langevin dynamics and
were performed in GROMACS, using the CHARMM 36m force field. The
real-space calculation of short-range non-bonded interactions was truncated at
1.2 nm, with van der Waals forces switched slowly to 0 from 1 nm. Long-range
electrostatics were calculated using the particle mesh Ewald algorithm. To use a
time step of 2 fs we constrained all bonds involving hydrogen atoms using the
SETTLE algorithm.
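
The step-wise protocol above can be summarized as a small bookkeeping sketch in Python. The `Stage` class and all variable names are hypothetical and serve only to tabulate the durations stated in the text; the restraint label for the side-chain release stage (backbone still restrained) is an assumption inferred from the description, not an explicit statement of the protocol.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    length_ns: float
    restrained: str  # which atoms carry positional restraints


# Durations as stated in the text; labels are illustrative.
EQUILIBRATION = [
    Stage("heating to 300 K at constant volume", 0.1, "protein heavy atoms"),
    Stage("Berendsen barostat, 1 atm", 0.9, "protein heavy atoms"),
    Stage("Parrinello-Rahman barostat", 1.0, "protein heavy atoms"),
    Stage("side-chain restraints removed", 5.0, "backbone (assumed)"),
    Stage("all restraints removed", 5.0, "none"),
]

total_equilibration_ns = sum(stage.length_ns for stage in EQUILIBRATION)

# Production: 1.05 microseconds with a 2-fs time step, in integer femtoseconds.
PRODUCTION_FS = 1_050_000_000
TIME_STEP_FS = 2
production_steps = PRODUCTION_FS // TIME_STEP_FS

# Only the last 0.8 microseconds of each run enter the analysis.
discarded_us = 1.05 - 0.8
```

Tabulated this way, each system receives 12 ns of equilibration before production, and the first 0.25 µs of each production run is treated as additional relaxation and excluded from analysis.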

We performed two simulations for the closed β-propeller state and four for the
open β-propeller state, each started from a system with randomized ion positions.
Only the last 0.8 µs were considered in the final analysis of the trajectories. All
analyses were performed with VMD and GROMACS.

To test whether a folded helix can be translocated through the constriction site
of TcB, we performed another set of simulations. For this, we predicted the
secondary structure of the ADP-ribosyltransferase using the NPS@ server. Two peptides,
residues 815-825 (peptide 1) and 892-899 (peptide 2) of TcC, showed high helical
propensity. We modelled them as ideal helices using MODELLER, placed their
mass centre at the constriction site of TcB-TcC in the open state, and oriented
them with their C terminus towards the exit of the channel. We capped the N and
C termini of the peptide with an acetyl and N-methyl amide group, respectively, to
prevent the charge of the termini from affecting the dynamics. We then prepared
the systems for simulation using the same protocol described above, and
equilibrated them with a small modification. Before the last step in which all restraints
are released, we performed a 5-ns simulation in which the backbone atoms of TcB-TcC
are still restrained but the ligand helices are free to move. To enforce the helical
conformation, we used the α-r.m.s.d. coordinate in PLUMED and placed a soft
lower wall (k = 500 kJ mol⁻¹; centres 5.5 and 2.5 for peptides 1 and 2, respectively)
to prevent the helix from unwinding. To prevent the helices from escaping towards
the cocoon, we also placed a soft upper wall controlling the distance between the
centre of mass of the peptide's Cα atoms and the Cα atoms of residues 406, 461,
511, 567, 620 and 672 (k = 500 kJ mol⁻¹ nm⁻², centre 2.2 nm). Finally, we kept
the β-propeller of TcB from collapsing by placing a soft upper wall on the distance
r.m.s.d. of the Cα atoms of residues 370-700 from TcB (k = 500 kJ mol⁻¹ nm⁻²,
centre 1 nm). We used the starting open propeller as reference and considered only
inter-atomic distances between 0.3 and 0.8 nm. All wall restraints were included
using PLUMED. For each system, we performed two 300-ns-long production runs.
In one of these we released the helical restraint on the peptide ligand, whereas in
the other we kept it for the full simulation time.
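
To make the effect of the soft walls concrete, the following Python sketch evaluates one-sided harmonic wall energies of the kind used here. It assumes the wall energy is k·(x − centre)² on the restrained side and zero otherwise, with an exponent of 2 and no offset or rescaling; the function names are hypothetical and the sketch is not the PLUMED input used in this study.

```python
def upper_wall(x, centre, kappa, exponent=2):
    """Energy of a soft upper wall: penalizes values of x above the centre."""
    excess = x - centre
    return kappa * excess ** exponent if excess > 0 else 0.0


def lower_wall(x, centre, kappa, exponent=2):
    """Energy of a soft lower wall: penalizes values of x below the centre."""
    deficit = x - centre
    return kappa * deficit ** exponent if deficit < 0 else 0.0


# Distance restraint from the text: upper wall centred at 2.2 nm, k = 500.
inside = upper_wall(2.0, centre=2.2, kappa=500.0)    # peptide within the wall, no penalty
outside = upper_wall(2.4, centre=2.2, kappa=500.0)   # peptide 0.2 nm beyond the wall

# Helicity restraint for peptide 1: lower wall centred at 5.5 on the helical coordinate.
unwinding = lower_wall(5.0, centre=5.5, kappa=500.0)
```

Under these assumptions, the walls leave the peptide unperturbed while it remains helical and close to the constriction site, and apply a quadratically growing penalty only once it starts to unwind or drift towards the cocoon.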

Structure-based molecular dynamics simulations. To simulate the transition
between open and closed states of the TcB β-propeller region, we performed
120 structure-based molecular dynamics simulations of the TcB-TcC complex,
60 starting from the closed and 60 from the open state. To study the effect of the
binding to TcA, we performed 80 simulations including amino acids 2326 to 2516
of TcA, 40 starting from the closed and 40 from the open state. We refer to these as
holotoxin simulations. We used only the TcB-binding domain of TcA, primarily to
save computational time. This is unlikely to affect the calculations as this region is
connected by a single loop to the rest of TcA and has few tertiary contacts with the
rest of the protein, essentially constituting an independent domain. In addition,
the contact map calculated by SMOG (see below) extends only 0.6 nm away,
guaranteeing that TcB will not interact with the rest of TcA. To further stimulate the
conformational change, we performed 120 additional holotoxin simulations with
a debilitated β-propeller (see below), 60 starting from the closed and 60 from the
open state. All individual simulations were run at T = 125 using an integration
step of 0.002 (GROMACS reduced units), and lasted 150,000,000 time steps.

In all cases, we used GROMACS and a Gaussian contact version of the
SMOG force field. To include both structures into the topology-based potential,
we used a strategy similar to the one described previously (ref. 71), with a
modified potential of the form:


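$$V = \sum_{\text{bonds}} \epsilon_r (r - r_0)^2 + \sum_{\text{angles}} \epsilon_\theta (\theta - \theta_0)^2 + \sum_{\text{impropers}} \epsilon_\chi (\chi - \chi_0)^2 + \sum_{\text{backbone}} \epsilon_{BB} F_D(\phi) + \sum_{\text{side chains}} \epsilon_{SC} F_D(\phi) + \sum_{\text{contacts}} C_{ij}$$
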
in which $r_0$, $\theta_0$ and $\chi_0$ were taken from the native structures. The
definition of all energy constants and their values have been described (ref. 71).
For $r_0$ and $\theta_0$, we used the average between the two states, as they are
mostly invariant. Impropers restrain planar groups and the configuration of
prolines. Therefore, we also used the average for $\chi_0$ for any angle for which
the difference between both structures was less than 10 degrees. In all other cases
(some prolines), we transferred the term from the closed state, as this represents
a model from a higher resolution map.

$C_{ij}$ can be defined in two different ways. For all atomic contacts present in only
one of the structures, we used a single Gaussian potential with an exclusion term.
For contacts present in the open and closed conformations, we used a dual basin
Gaussian potential, essentially as described (ref. 71; see Extended Data Fig. 6g, h). For
the simulations with destabilized β-propeller region, we decreased the strength of
all contacts involving β-propeller atoms to 90% of their original strength.
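
For readers unfamiliar with Gaussian contact potentials, the following Python sketch shows one commonly used functional form in which an excluded-volume term is multiplied by one Gaussian factor per native distance, so that a contact can carry one or two minima of equal depth. This form and every parameter value (well width, excluded-volume radius, distances) are assumptions made for illustration; the exact expression and constants used in this study are those of the cited reference and are not reproduced here.

```python
import math


def gaussian_factor(r, r0, sigma):
    """Factor that vanishes at r = r0 and tends to 1 far from r0."""
    return 1.0 - math.exp(-((r - r0) ** 2) / (2.0 * sigma ** 2))


def contact_energy(r, minima, eps=1.0, sigma=0.05, r_excl=0.17):
    """Gaussian contact energy with an excluded-volume term (assumed form).

    minima holds one native distance (single basin) or two (dual basin);
    eps is the well depth, sigma the well width and r_excl the
    excluded-volume radius (all in reduced or nm units, hypothetical).
    """
    value = 1.0 + (r_excl / r) ** 12 / eps
    for r0 in minima:
        value *= gaussian_factor(r, r0, sigma)
    return eps * (value - 1.0)


# Hypothetical contact whose distance is 0.6 nm in one state and 0.9 nm in the other.
single = contact_energy(0.6, minima=[0.6])
dual_a = contact_energy(0.6, minima=[0.6, 0.9])
dual_b = contact_energy(0.9, minima=[0.6, 0.9])

# Destabilized propeller: the same contact at 90% of its original strength.
weakened = contact_energy(0.6, minima=[0.6, 0.9], eps=0.9)
```

In this form the well depth at each minimum equals eps exactly, while the repulsive wall is independent of eps, which is why scaling the contact strength to 90% weakens the attraction without changing the excluded volume.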

A dual dihedral energy term $F_D(\phi)$ was defined essentially as in ref. 71 (see
Extended Data Fig. 6g, h). In brief, the term for a single dihedral is defined as


$$F_D(\phi) = [1 - \cos(\phi - \phi_0)] + 0.5\,[1 - \cos(3(\phi - \phi_0))]$$


in which $\phi_0$ represents the dihedral observed in the structure. To create a term
that includes minima for two states (a and b) we defined a conditional function.
We first identified av1, the average between the dihedral angles observed in each
structure; we then identified av2, which corresponds to av1 + 180°. Assuming that
state a has the smallest dihedral angle, the new dual $F_D(\phi)$ function corresponds to
$F_D(\phi)_a$ for av2 < $\phi$ < av1, and to $F_D(\phi)_b$ for av1 < $\phi$ < av2. To properly
connect the halves of the function, we identified the appropriate extrema closest to
av1 or av2, and connected them using the function
$F_D(\phi) = k(\phi - q)^2(\phi - p)^2 + c$, in which k, q, p and c were chosen so that
the first and second derivatives were the same as in the cosine function at the
connecting points. For an overview of the final functions please see Extended Data
Fig. 6g, h. All files for the simulations were prepared with
a standalone version of SMOG2 and modified using in-house-built scripts. All
analyses were performed in VMD. Two atoms were defined as forming their native 
contact if their distance was less than 1.2 times the distance observed in the native 
structure. Using the native contact maps, we defined a collective variable corre- 
sponding to the difference between the number of contacts in the closed and open 
states of each blade. We used this to follow the conformational change. To reduce 
the noise level, the collective variable trajectories were smoothed using running 
averages calculated with a window size of 150,000 time steps. Using these values, 
we considered that a blade had lost its state when it had less than 105 contacts 
formed. To characterize the conformational change mechanism, we combined all 
trajectories and used the new collective variable to build the histograms shown in 
Fig. 2 and Extended Data Fig. 7. 
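
The contact-based collective variable and its smoothing can be sketched in a few lines of Python. The function names and the toy distances are hypothetical, and the sketch operates on plain lists rather than on trajectory files; it only illustrates the 1.2× native-distance criterion, the closed-minus-open contact difference and the running average described above.

```python
def formed(distance, native_distance, cutoff=1.2):
    """A native contact counts as formed if it is within 1.2x its native distance."""
    return distance < cutoff * native_distance


def count_formed(distances, native_distances):
    """Number of formed contacts in one frame for one contact map."""
    return sum(formed(d, d0) for d, d0 in zip(distances, native_distances))


def collective_variable(closed, open_):
    """Contacts formed in the closed-state map minus those in the open-state map.

    Each argument is a (current distances, native distances) pair for one blade.
    """
    return count_formed(*closed) - count_formed(*open_)


def running_average(values, window):
    """Simple moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    smoothed = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        smoothed.append(total / window)
    return smoothed


# Toy frame: two closed-state contacts (one formed) and one open-state contact (formed).
cv = collective_variable(([0.5, 0.9], [0.5, 0.5]), ([0.4], [0.4]))
smoothed = running_average([1, 2, 3, 4, 5], window=3)
```

In the analysis described here the window would span 150,000 time steps, and a blade would be counted as having lost its state once its smoothed contact count dropped below 105.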

Reporting Summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 


Data availability
The densities and atomic coordinates of ABC(WT) and ABC(D651A) have been
deposited in the Electron Microscopy Data Bank under the accession numbers
EMD-0149 and EMD-0150, respectively, and in the Protein Data Bank under
accession numbers 6H6E and 6H6F, respectively. The atomic coordinates of the
crystal structure of TcdB2-TccC3 without HVR have been deposited in the Protein
Data Bank under the accession number 6H6G. The HDX data are available in the
Source Data for Fig. 4. The datasets generated and/or analysed during the current
study are available from the corresponding author upon reasonable request.

Extended Data Fig. 1 | Cryo-EM of the wild-type ABC and
ABC(D651A) holotoxin complex. a, Typical digital micrograph area of
vitrified wild-type holotoxin complexes at a defocus of 2 µm and a total
dose of 60 e⁻ Å⁻² acquired with a Falcon II direct electron detector. Scale
bar, 60 nm. b, Representative reference-free 2D class averages obtained
by ISAC and subsequently resampled to the original pixel size, refined
and sharpened, using the Beautifier tool implemented in the SPHIRE
software package. Scale bar, 20 nm. c, Fourier shell correlation (FSC). The
0.143 FSC cut-off criterion indicates that the cryo-EM map has an average
resolution of 3.94 Å. The inset shows a representative area of the density
map superimposed with the atomic model. d, Angular distribution for the
final round of the refinement. Each stick represents a projection view. Size
and colour of the stick are proportional to the number of particles.
e, Surface and cross-section of the cryo-EM density map coloured
according to the local resolution. f, Molecular model of the wild-type
holotoxin complex coloured by B-factor. g, Typical digital micrograph
area of vitrified ABC(D651A) complexes at a defocus of 2 µm and a total
dose of 35 e⁻ Å⁻². Scale bar, 60 nm. h, Characteristic reference-free 2D
class averages of ABC(D651A) complexes. Scale bar, 20 nm. i, The 0.143
FSC cut-off criterion indicates that the cryo-EM map of ABC(D651A)
has an average resolution of 3.72 Å. The inset shows a representative area
of the density map superimposed with the molecular model. j, Angular
distribution for the final round of the refinement. k, Surface and
cross-section of the cryo-EM density map coloured according to the local
resolution. l, Molecular model of the ABC(D651A) holotoxin complex
coloured by B-factor. Note that the TcC density in both volumes shows
comparatively low local resolution and the molecular models of these
specific areas were obtained by flexible fitting of the available crystal
structure using MDFF and subsequent local refinement with Rosetta.

Extended Data Fig. 2 | Details of the TcA-TcB interface. a, Interactions
at the TcA-TcB interface. Binding of TcA to TcB-TcC is stabilized by
interactions between the pseudo-six-fold symmetrical β-propeller of TcB
and the five-fold symmetrical TcB-binding domain of TcA. The gatekeeper
domain, shown in blue, undergoes the largest conformational changes
during gate opening. In the open state, residues R485 (i) and K534 (iv)
of blades 3 and 4, respectively, are positioned within negatively charged
grooves of the TcB-binding domain of TcA. In addition, two copies of
residue L2422 of two adjacent TcA subunits are positioned within a
prominent hydrophobic groove of blade 3 (ii) and 4 (iii), respectively.
Interacting residues of TcB are shown as sticks. Surfaces of TcA involved
in the interfaces are coloured from high (orange) to low (white)

from −10 kcal mol⁻¹ (red) to 10 kcal mol⁻¹ (blue) at pH 7.5 (i, iv, v)).
Similar to the interfaces between blade 3 and 4 and TcA, blade 5 forms
strong hydrophobic interactions with residue L2422 of TcA (vi) and
electrostatic interactions with negatively charged patches of the opposing
TcA domain (v). In contrast to the other blades, no prominent electrostatic
or hydrophobic interactions can be observed with high certainty at this
interface. Several candidates for amino acid residues were identified as
putative hydrogen bond donors or acceptors (vii, viii). Colours correspond
to those in Fig. 1. b, Conservation of residues at the TcA-TcB interface.
Positions of residues of the β-propeller domain of TcB interacting with
TcA are shown as in a. The model of the β-propeller domain is coloured
according to sequence conservation, with cyan representing non-conserved
residues and magenta representing highly conserved residues.
The TcB-binding domain of TcA is shown in light green. c, Sequence
alignment of TcB sequences. Asterisks indicate the positions of the
residues highlighted in b. The sequence of P. luminescens TcdB2 is
outlined. The alignment is coloured according to b.

Extended Data Fig. 3 | Binding affinities of TcB-TcC, TcB-TcC(D651A)
and empty TcB-TcC for TcA and of TcB-TcC for TcA(L2422E).
a, Interaction of TcA with TcB-TcC(WT), empty TcB-TcC, or
TcB-TcC(D651A), and interaction of TcA(L2422E) with TcB-TcC were measured
by BLI. K_D, k_on and k_off obtained from global fits are shown. Data are
mean ± error of fit; 5-7 individual curves were included in the global
fits. b-e, BLI sensorgrams of TcA(WT) interacting with immobilized
TcB-TcC(WT) (b), TcB-TcC(D651A) (c), empty TcB-TcC (d) and
TcA(L2422E) interacting with immobilized TcB-TcC(WT) (e). TcA
pentamer concentrations were 1.25 nM-80 nM in b and c, 16-1,000 nM in
d and 2.8-180 nM in e. A global fit according to a 1:1 binding model was
applied (black dashed curves). Resulting K_D, k_on and k_off values are shown.
Association and dissociation phases are separated by a grey dotted line.
f, g, Negative-stain electron micrographs of TcA(WT) (f) and
TcA(L2422E) (g) incubated with wild-type TcB-TcC. TcA (200 nM) was
incubated with TcB-TcC (300 nM), and the excess of free TcB-TcC was
removed by size-exclusion chromatography before imaging. Red circles
in g highlight side views of TcA without TcB-TcC. Experiments were
performed three times with comparable results. Scale bar, 200 nm.

Extended Data Fig. 4 | Opening of the β-propeller gate of TcB.
a, Opening of the β-propeller gate is triggered by sensor loops. Side views
of the closed (left; before binding to TcA) (PDB 4O9X) and open (right;
after binding to TcA) state of the TcA-binding six-bladed β-propeller
domain of TcB-TcC. The structure of the closed state was structurally
aligned with the structure of the open state and is shown together with
the TcB-binding domain of TcA (middle). Note the clashing loops (sensor
loops) between TcA (residues 2418-2430, green) and TcB (residues
527-536, orange). Colours correspond to those in Fig. 1. b, c, Effect
of the ADP-ribosyltransferase on the structure of TcB-TcC. Crystal
structures of unbound wild-type and empty TcB-TcC are shown in b.
TcB-TcC structures obtained from the cryo-EM structures of ABC(WT)
and ABC(D651A) are shown in c. For the structures of TcB-TcC(WT),
the channel radius profile is shown as a function of distance along the
channel axis before and after TcA binding in b and c, respectively (left).
The narrowest constriction of the channel lumen towards the TcA-binding
domain (constriction site) has a diameter of 10.5 Å in the open state. For
all structures, the mesh surface of the computed channel along the cocoon
interior is shown in yellow. Note that the presence or absence of the
ADP-ribosyltransferase does not affect the channel profile in the unbound state
of the cocoon (b). Note also that cocoons with cleaved (TcB-TcC(WT)) or
uncleaved (TcB-TcC(D651A)) ADP-ribosyltransferase show an almost
identical channel profile following binding to TcA (c). d, e, Comparison of
crystal contacts in the structures of empty TcB-TcC and TcB-TcC(WT).
Top and side views of the crystal structures of empty TcB-TcC (d) and
TcB-TcC(WT) (PDB 4O9X) (e) including crystal contacts. The
TcA-binding domain of TcB is indicated by a dashed box and coloured as in
Fig. 1b, c. The gatekeeper hairpin (residues 514-524) is highlighted in red.

Extended Data Fig. 5 | Analysis of TcA and TcB sensor-loop mutants.
a, Analysis of the first purification step (Ni²⁺-affinity chromatography)
of TcA(WT) and sensor-loop deletion variants. P, insoluble fraction;
S, soluble fraction; F, flow-through; M, protein marker. The gradient
corresponds to 20-300 mM imidazole. Marker proteins: 250, 180, 130,
100, 70, 50 and 40 kDa. TcA is marked by an asterisk. Purification was
performed once for each TcA variant. Purification of TcA(WT) resulted in
comparable results for more than 5 experiments. b, K_D, k_on and k_off values
of the global fits obtained from the BLI measurements (c-f), analogous
to Extended Data Fig. 3a. Data are mean ± error of fit; 6-7 individual
curves were included in the global fits. c-f, BLI sensorgrams and binding
affinities of TcB-TcC sensor-loop mutants with TcA. BLI sensorgrams of
TcA(WT) interacting with immobilized TcB(F532A)-TcC (c),
TcB(D530A/F532A)-TcC (d), TcB(Δ528-534-1Gly)-TcC (e) and
TcB(Δ528-534-2Gly)-TcC (f) (-Gly indicates glycines replacing the TcB sensor loop).
A global fit according to a 1:1 binding model was applied (black dashed
lines). TcA pentamer concentrations were 3.75-240 nM in c-e and
2.5-80 nM in f. Association and dissociation phases are separated by a
grey dotted line. g, Intoxication of HEK293T cells with holotoxin formed
by TcA(WT) and the indicated TcB-TcC variants. Scale bars, 200 µm.
Cells (2 x 10*) in DMEM/F12 medium were incubated with 0.5 or 5 nM of
holotoxin for 16 h at 37 °C before imaging. Experiments were performed
in triplicate with qualitatively identical results. The wild-type holotoxin
and the complexes with the two sensor-loop point mutations show strong
cytotoxic effects. Both loop deletion variants, however, are not toxic to
cells, even at 10 times higher concentration. h, i, Holotoxin complex
formation between TcA(WT) and TcA(L2422E) and TcB-TcC sensor-loop
variants. h, Chromatograms of TcB-TcC(WT) and selected variants with
TcA(WT) (solid lines) and TcA(L2422E) (dashed lines). TcA pentamer
(200 nM) and TcB-TcC (400 nM) were incubated for 1 h at 22 °C before
loading on a Superose 6 5/150 column (GE Life Science). Experiments
were performed twice with identical results. i, Electron micrographs
of different combinations of TcA and TcB-TcC. The main peaks of the
chromatography runs (h) were negatively stained and imaged. Scale bar,
200 nm. Experiments were performed twice with qualitatively identical
results. All sensor-loop mutants show almost exclusively holotoxins when
mixed with TcA(WT), but practically no holotoxins with TcA(L2422E).
TcB-TcC(WT), however, can form a holotoxin with TcA(L2422E).

Extended Data Fig. 6 | Structural stability of the β-propeller of TcB
during molecular dynamics simulations. Principal component analysis
of the trajectories started from the closed (a-c) or open (d-f) states of
the TcB-TcC complex. The plots show the root mean square fluctuation
associated with the first (a, d), second (b, e) and third (c, f) principal
components. The percentages in the legends indicate the fractional
contribution that each component makes to the total variance. The
structures show the range of conformations observed along their
corresponding component. The colour scale in the structures represents
the position along each principal component going between the extremes,
from red to green to blue. For guidance, the blades of the β-propeller are
labelled in structures and plots, with blades 3 and 4 highlighted in blue.
g, h, Graphical representation of the hybrid potentials used for the dual
conformation structure-based SMOG force field. A Gaussian potential (g)
can be used to create a contact term that includes exactly two minima
(states a and b) with equal well depth, corresponding to the observed
distance in two independent structures. In addition, by using a Gaussian
potential the excluded volume of the contact can be independently
controlled. A representation of the hybrid dihedral angle potential used in
this study is shown in h. Starting from the dihedral functions for states a
and b, the functions are combined as described in Methods. The regions
between the averages (av1 and av2) and the closest extrema are connected
using a polynomial function, to guarantee continuity and differentiability.

Extended Data Fig. 7 | Conformational distribution observed in the
structure-based molecular dynamics simulations. a, Distribution of
states of blades 3 and 4 during the simulations of free TcB-TcC.
b, Histogram of the conformations sampled by the gatekeeper domain
during the simulations of free TcB-TcC complex. c, Distribution of states
of blades 3 and 4 during the simulations of the holotoxin. d, Histogram
of the conformations sampled by the gatekeeper domain during the
simulations of the holotoxin. In the simulations started from the
closed state, the protein explores two minor conformations, which
are highlighted in the plot and shown in g. e, Distribution of states of
blades 3 and 4 during the simulations of the holotoxin with destabilized
β-propeller. f, Histogram of the conformations sampled by the gatekeeper
domain during the simulations of the holotoxin with destabilized
β-propeller. g, Representative structures for the two states highlighted in
d. Colours correspond to those in Fig. 2. The histograms in d and f were
calculated using running averages of the trajectories, using a window size
of 150,000 time steps.

Extended Data Fig. 8 | Conservation of negatively charged residues
that form a constriction between the barrel and the β-propeller domain
of TcB. a, Positions of the residues D34, N60, D73, E100 and E120 in the
constriction site of TcB. The model is coloured according to sequence
conservation, with cyan representing non-conserved residues and magenta
representing highly conserved residues. b, Sequence alignment of TcB
sequences. Asterisks indicate the positions of the residues highlighted in a.

Extended Data Fig. 9 | Modelling of a helical peptide at the constriction
site of TcB-TcC. We placed two different peptides with predicted α-helical
secondary structure from the HVR of TcC and performed simulations
with the peptides free (unrestrained) or restrained to their helical
conformation. a, Central section through TcB-TcC showing the average
water occupancy during the 300-ns-long simulations. Bulk water is shown
in white (~0.5 occupancy), whereas bound water molecules appear in red.
Blue represents the regions that are never hydrated, which corresponds to
the space occupied by the protein. The occupancy shows that there is still
a small channel of water next to the helical peptide. b, c, Representative
snapshot of the trajectories. TcB (cyan) and TcC (violet) are shown as cut
surfaces to illustrate the interior of the barrel. The bound helix is shown as
a blue ribbon. Side chains of the bound helix and water molecules at most
8 Å away from it are shown in ball and stick representation. For each
simulation, we included a close-up of the bound helix (c). d, Stability
of the helical ligand bound to the constriction site. The variable
α-r.m.s.d. represents the number of 6-residue-long segments with helical
conformation within the peptide.

Extended Data Fig. 10 | HDX-MS of TcB-TcC and holotoxin complexes.
a, Relative deuteration of TcB-TcC and holotoxin complexes at 50-min
deuteration time. Relative deuteration values normalized to the maximum
deuteration value of all peptides are shown in a gradient from 0.0 (blue) to
0.64 (red) relative deuteration level. Peptides without deuteration data are
coloured grey. The position of the gatekeeper hairpin (residues 514-524,
Fig. 4b) is indicated by a box. b, c, HDX-MS of residues 463-471 as
part of the TcA interface in bound and unbound TcB-TcC. The position
of residues 463-471 of the TcA-binding domain in unbound TcB-TcC
(b, left, red) and TcA-bound TcB-TcC (b, right, red) is indicated. The
TcB-binding domain of TcA is shown in green (b, right). Equilibrium
hydrogen-deuterium exchange at residues 463-471 in empty TcB-TcC
(black circles), TcB-TcC(WT) (purple circles), TcB-TcC(D651A) (blue
triangles), ABC(WT) (green triangles) and ABC(D651A) (light green
squares) is displayed in c. Relative deuterium uptake of the residues as
a function of incubation time in D2O is shown. Data points show two
independent replicates of each measurement. The solid lines represent the
arithmetic mean of both replicates.
Major galaxy mergers are thought to play an important part 
in fuelling the growth of supermassive black holes’. However, 
observational support for this hypothesis is mixed, with some 
studies showing a correlation between merging galaxies and 
luminous quasars?” and others showing no such association*®. 
Recent observations have shown that a black hole is likely to become 
heavily obscured behind merger-driven gas and dust, even in the 
early stages of the merger, when the galaxies are well separated®® 
(5 to 40 kiloparsecs). Merger simulations further suggest that such 
obscuration and black-hole accretion peak in the final merger 
stage, when the two galactic nuclei are closely separated? (less than 
3 kiloparsecs). Resolving this final stage requires a combination of 
high-spatial-resolution infrared imaging and high-sensitivity hard- 
X-ray observations to detect highly obscured sources. However, 
large numbers of obscured luminous accreting supermassive 
black holes have been recently detected nearby (distances below 
250 megaparsecs) in X-ray observations!”. Here we report high- 
resolution infrared observations of hard-X-ray-selected black 
holes and the discovery of obscured nuclear mergers, the parent 
populations of supermassive-black-hole mergers. We find that 
obscured luminous black holes (bolometric luminosity higher than 
2 × 10⁴⁴ erg per second) show a significant (P < 0.001) excess of 
late-stage nuclear mergers (17.6 per cent) compared to a sample of 
inactive galaxies with matching stellar masses and star formation 
rates (1.1 per cent), in agreement with theoretical predictions. Using 
hydrodynamic simulations, we confirm that the excess of nuclear 
mergers is indeed strongest for gas-rich major-merger hosts of 
obscured luminous black holes in this final stage. 

The Burst Alert Telescope (BAT) on the Neil Gehrels Swift Observatory 
has surveyed the entire sky at unprecedented depths in the ultra-hard 
X-ray band (14-195 keV) and primarily detects accretion onto super- 
massive black holes at the centres of nearby galaxies. Detection in the 
ultra-hard X-ray band is possible even when obscuring gas and dust in 
the host galaxy considerably attenuate the ultraviolet, optical or softer 
X-ray emission around the growing black holes. At the distance to the 
nearest luminous accreting black hole (about 220 Mpc or at a redshift of 
0.05), ground-based optical imaging typically achieves a resolution 
of the order of 1″, or 1 kpc, in the host galaxy. This spatial resolution is 
not sufficient to resolve the final merger stage in the host galaxies down 
to the hundreds of parsecs. However, these scales can be resolved with 
near-infrared adaptive optics, which provide an improvement by a factor of 10 
in spatial resolution (about 0.1″). These scales are still above the black-hole 
sphere of influence, which is of the order of 10-100 pc for black holes 
with masses of 10⁷ M☉ to 10⁹ M☉ (M☉, solar mass). 
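
The angular-to-physical conversions quoted in this paragraph can be checked with the small-angle relation s = D × θ. The sketch below is illustrative only: the function name is hypothetical, and it assumes the quoted 220 Mpc can be used directly as the distance in that relation.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)  # one arcsecond in radians


def physical_scale_kpc(distance_mpc, angle_arcsec):
    """Physical size (kpc) subtended by a small angle at a given distance.

    Assumption: distance_mpc enters the small-angle relation s = D * theta
    directly, with no cosmological correction.
    """
    return distance_mpc * 1.0e3 * angle_arcsec * ARCSEC_IN_RAD


# Seeing-limited optical imaging (about 1 arcsec) versus adaptive optics (about 0.1 arcsec)
seeing_limited_kpc = physical_scale_kpc(220.0, 1.0)
adaptive_optics_pc = physical_scale_kpc(220.0, 0.1) * 1.0e3

print(f"1 arcsec at 220 Mpc   -> {seeing_limited_kpc:.2f} kpc")
print(f"0.1 arcsec at 220 Mpc -> {adaptive_optics_pc:.0f} pc")
```

Under these assumptions the two cases come out near 1 kpc and near 100 pc, in line with the figures given in the text.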

We observed 96 nearby (z < 0.075) black holes detected by Swift/BAT 
in the hard X-ray band. The black holes were selected at random over 
a wide range of luminosities and observed using the adaptive optics system 
on the Keck 2 telescope at the W. M. Keck Observatory with a near-infrared 
camera (NIRC2). These near-infrared observations (2.1 μm) include 
both obscured and unobscured accreting black holes and have an average 
spatial resolution of 0.13″, about a factor of 10 better than previous 
ground-based surveys. We combined these adaptive-optics observations 
with available high-resolution archival Hubble Space Telescope 
(HST) near-infrared images of 64 Swift/BAT-detected active galactic 
nuclei (AGN) with an average spatial resolution of 0.17″. These observations 
provide the first evidence of a sizeable population of double 
nuclei with very small separations (0.3-3 kpc) in late-stage mergers, 
which could not be detected in lower-resolution ground-based optical 
observations and were not detected in previous near-infrared samples 
of AGN observed with the HST (Fig. 1d-f). 

We separated our sample into obscured and unobscured accreting 
black holes based on the presence of broad Hβ lines in optical 
spectroscopy images from past studies’! and into low- and high-luminosity 
(below and above a bolometric luminosity of L_bol = 2 × 10⁴⁴ erg s⁻¹, 
respectively) using their X-ray emission’”. We also compared our sample 
with 176 inactive galaxies matched in stellar masses and star formation 
rates that have high-resolution HST near-infrared images. Example 
high-resolution images of the inactive-galaxy sample are provided in 
Extended Data Fig. 3. A comparison of the stellar masses (Extended 
Data Fig. 4a), star formation rates (Extended Data Fig. 4b), physical 
resolutions (Extended Data Fig. 4c) and the consistency of the control 
sample with random inactive galaxies taken from the Sloan Digital 
Sky Survey (Extended Data Fig. 6) show that (Fig. 2a) the obscured 
luminous black holes show a significantly (P < 0.001) higher fraction 
(17.6%, 6/34) of nuclear mergers (<3 kpc) than inactive galaxies (1.1%, 
2/176), unobscured luminous black holes (1.8%, 1/55) and lower- 
luminosity black holes (2.7%, 2/73). When comparing the fractions of 
nuclear mergers for obscured and unobscured luminous black holes, 
the difference is also significant (P ≈ 0.01). Finally, a higher proportion 
of nuclear mergers (separation R < 3 kpc) in obscured luminous AGN 
is also found when comparing them with lower-luminosity black holes 
(P ≈ 0.01). At larger separations (3-10 kpc) the fraction of mergers in 
obscured luminous black holes is higher than in the other samples, but 
this difference is not statistically significant (P > 0.29). All the mergers 
identified at R< 10 kpc are listed in Extended Data Table 1. Finally, we 
note that even observations in the near-infrared band may sometimes 
miss nuclear mergers with very heavy extinction’, so these measure- 
ments should be seen as a lower limit. 

While past work has found some nuclear mergers, our study is the 
first, to our knowledge, to demonstrate a significant excess (P < 0.001) 
of nuclear mergers in obscured luminous black holes in comparison 
to a matched sample of inactive galaxies. Past studies have typically 
focused on subsets of AGN galaxies at larger separations (for example, 
10-30 kpc). For instance, some nuclear mergers have been identified 
Fig. 1 | Example images of final-stage mergers. a-c, Tricolour 
optical images in the gri band from the Sloan Digital Sky Survey or the 
Kitt Peak survey with 1″ angular resolution. The galaxies shown are 
2MASX J01392400+2924067 (a), CGCG 341-006 (b) and MCG+02-21-013 (c). 
The images are 60 kpc × 60 kpc in size. Red squares indicate 
the size of the zoomed-in adaptive optics image on the right. 
d-f, Corresponding near-infrared, Ks-band (effective wavelength, 
2.12 μm) adaptive optics images of nuclear mergers taken with the 
Keck/NIRC2 instrument. These images are 4 kpc × 4 kpc in size. 


in sources with double-peaked [O III] λ5,007 emission lines, which 
result from the emission from both nuclei'*. In a sample of 60 double- 
peaked sources observed with NIRC2", only 4/60 (or 6.7%) were in 
major mergers with <3 kpc separations—a much smaller proportion 
than that seen in the obscured luminous black holes studied here. 
Some nuclear mergers have also been detected in the host galaxies’® 
of luminous infrared galaxies with accreting black holes, which have 
very high star formation rates, probably associated with the merger. 
However, only one of the ten nuclear mergers in our hard-X-ray sam- 
ple is associated with a luminous infrared galaxy (NGC 6240), and 
none show double-peaked [O III] λ5,007 emission lines. This indicates 
that both of these diagnostics are incomplete indicators of nuclear 
mergers. 

When considering the fractions of galaxies found at various merger 
stages, it is critical to consider the corresponding observability timescale, 
because the time spent at small separations is thought to be much shorter 
than that spent at larger separations. For instance, in a recent merger 
simulation study””, the time spent at separations R <3 kpc could be more 
than five times shorter than the time spent at separations of 3-10 kpc 
(about 50 Myr versus 300 Myr). Thus, the excess fraction of nuclear 
mergers in obscured luminous black holes that we find in our data is 
surprising, and it probably reflects a strong link between such mergers 
and intense black-hole accretion. To compare our observations with 
theoretical results more directly, we use a suite of state-of-the-art 
high-resolution hydrodynamical galaxy merger simulations with the 
galaxy stellar mass, black-hole mass and black-hole bolometric lumi- 
nosity set up to reproduce the accreting black holes and their host 
galaxies observed in our study (GADGET-3 code’®; see Methods). 
We also consider the simulated mergers at random orientations to 
the observer to account for the fact that some mergers would appear 
closer simply because of the projection effect (that is, their alignment 
with the observer's line of sight). 

The simulations show that obscured luminous black-hole phases 
preferentially occur in the late stages of gas-rich (M_gas/M_* > 0.1) major 
(M_1/M_2 < 5) mergers, where M_gas is the gas mass, M_* is the stellar mass 
and M_1 (M_2) denotes the galaxy with the larger (smaller) stellar mass. 
Consistent with our observations, late-stage mergers are less preva- 
lent in lower-luminosity black holes and inactive galaxies (Fig. 2b). 
Finally, our simulations show that obscured luminous black holes, 
which occur in the post-merger phase (after the two galactic nuclei 
and black holes have merged), contribute as much to the growth of 
the obscured black hole as the entire merger phase (R < 30 kpc). We 
note that during the late stages of our simulated galaxy mergers, the 
black holes spend very little time in an unobscured luminous accreting- 
black-hole phase. These results are consistent with previous theoret- 
ical work” that showed that merger-triggered accreting black holes 
are preferentially more luminous and obscured than those growing by 
stochastic feeding via slower secular processes. This explains the lack 
of nuclear mergers in low-luminosity AGN seen in a previous large- 
sample (>200) high-resolution study of AGN and normal galaxies 
using the HST”°. Moreover, simulations find that although global star 
formation is enhanced primarily in the early stages of the first merger 
passage, black-hole growth is minimal until the late merger stages”’, 
when the galaxies pass within a few kiloparsecs of each other and cause 
tidal torques that increase nuclear gas inflows. 

We simulated a set of mock HST images, targeting redshifted versions 
of our imaging datasets (see Methods) at the peak of black-hole 
growth at z ~ 1-2, and found that the HST would miss the majority 
of such systems (7/8) at merger separations below 3 kpc because it lacks 
the spatial resolution and sensitivity needed to 
identify the nuclear mergers that we find in our low-redshift sample. 
The upcoming James Webb Space Telescope will provide substantial 
improvements in sensitivity. However, the resolution of such nuclear 
mergers requires the use of adaptive optics systems in the next gen- 
eration of large-diameter ground-based telescopes (for example, the 
Thirty Meter Telescope, the European Extremely Large Telescope and 
the Giant Magellan Telescope). These will reach resolutions of 300 pc 
using adaptive optics at z~ 1-2—scales that are consistent with the 
smallest-separation mergers identified in this study. 

With the discovery of gravitational waves emitted from the merger of 
stellar-mass black holes, interest in understanding gravitational waves 
produced from the merger of supermassive black holes has increased 
considerably. The study of nuclear mergers is therefore critical for 
comparison with cosmological merger-rate models, because it can help 
constrain the timescales for supermassive-black-hole inspiral and the 
rate of such events, which are likely to be found with gravitational wave 
detectors, such as pulsar timing arrays” and the Laser Interferometer 
Space Antenna”’. Predictions of the detection rates for these instruments 
are based on parameterizations of the merger rates and the 
supermassive-black-hole population™, but these are highly uncertain 
and vary by orders of magnitude”’. Gravitational-wave observatories 
will also struggle with the localization of the sources, which is possible 
only with a resolution of the order of 10 square degrees”, thus requiring 
a better characterization of their likely precursors. Thus, the study of 
nuclear merger fractions and their correlation with galaxy populations 
can provide crucial benchmarks for models of black-hole inspiral and 
the strength of gravitational-wave signals. 
Fig. 2 | Fraction of close mergers. a, Fraction of mergers, determined 
using high-resolution images obtained with either Keck adaptive optics or 
the HST. The sample of high-luminosity obscured accreting black holes or 
AGN shows a strong excess of small-separation mergers (<3 kpc). Other 
types show no significant excess compared to inactive galaxies. Error bars 
correspond to 1σ confidence intervals. b, Results from a suite of gas-rich 
high-resolution hydrodynamical galaxy merger simulations for a range of 
viewing angles. Δt is the time spent at a separation range, and error bars 
represent the median absolute deviation. Our observed merger fractions 
are consistent with obscured and luminous accreting black holes occurring 
primarily in gas-rich major mergers. 
METHODS 


Data analysis and sample overview. We selected our sources from the 70-month 
Swift/BAT catalogue, which contains 1,171 sources, 836 of which are accreting 
black holes or AGN. We cross-matched this sample with the Roma Blazar Catalog”” 
to avoid beamed and radio-bright black holes, which have been extensively studied 
in past high-resolution studies. A full list of the observational aspects of all galaxies 
examined is provided in a machine-readable table in Supplementary Information. 

We used the NIRC2 imager and an adaptive optics (AO) system to observe 
96 low-redshift (0.01 < z < 0.075) Swift/BAT-detected black holes with suitable 
tip-tilt stars. The images were taken over nine nights spread between 2012 and 
2014. For bright unobscured AGN, the nucleus was used as the point source for 
the tip-tilt correction. Images were taken in the Ks band (effective wavelength, 
λ_eff ≈ 2.12 μm) and, when possible in good seeing conditions, in the J and H bands 
(1.25 and 1.63 μm, respectively). We used a wide-field camera with a resolution of 
40 mas pixel⁻¹ and a 40″ field of view. We used a three-point dither pattern that 
avoids a known artefact in the lower left part of the field. For calibration purposes 
we took dark- and flat-field images for each night of observation. 

The data were reduced using a custom JLU python code for NIRC2 reduction. 
The code was modified to ensure that extended galaxy emission features were not 
subtracted from the background using SExtractor”®. The images were combined 
by weighting by the Strehl ratio of each image. 
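
A Strehl-weighted combination of this kind can be sketched as a normalized weighted mean of aligned frames. This is not the JLU reduction code: the function name is hypothetical, the frames are assumed to be already registered to a common pixel grid, and normalizing by the sum of the weights is an assumption.

```python
def strehl_weighted_mean(frames, strehl_ratios):
    """Combine aligned frames pixel by pixel, weighting each by its Strehl ratio.

    frames: list of 2D lists with identical shapes (assumed already registered).
    strehl_ratios: one positive weight per frame.
    """
    if not frames or len(frames) != len(strehl_ratios):
        raise ValueError("need one Strehl ratio per frame")
    total_weight = sum(strehl_ratios)
    n_rows, n_cols = len(frames[0]), len(frames[0][0])
    combined = [[0.0] * n_cols for _ in range(n_rows)]
    for frame, weight in zip(frames, strehl_ratios):
        for i in range(n_rows):
            for j in range(n_cols):
                # Sharper frames (higher Strehl ratio) contribute more to each pixel.
                combined[i][j] += weight * frame[i][j] / total_weight
    return combined


# Toy example: two 1 x 2 frames with Strehl ratios of 0.3 and 0.1
stacked = strehl_weighted_mean([[[1.0, 2.0]], [[3.0, 6.0]]], [0.3, 0.1])
print(stacked)
```

Weighting by the Strehl ratio favours the frames with the best AO correction, so the combined image keeps more of the sharp core than an unweighted mean would.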

To increase our sample size we also added 64 BAT-detected AGN that were 
observed with the HST NICMOS or WFC3 cameras. The images were taken with 
the F105W (1.05 μm) or F160W (1.60 μm) filters, with the majority (62/64, 97%) 
in the F160W band. Individual frames were co-added, corrected for cosmic rays 
and distortion, and registered using the default values in AstroDrizzle. For galaxies 
with z < 0.01, we used the mean value of redshift-independent distance 
measurements from the NASA Extragalactic Database, when available. We otherwise 
adopted a cosmology of Ω_m = 0.3, Ω_Λ = 0.7 and H_0 = 70 km s⁻¹ Mpc⁻¹ (Ω_m, matter 
density; Ω_Λ, dark-energy density; H_0, Hubble constant) for all distances computed. 
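
For readers who want to reproduce distances in the adopted cosmology, the sketch below integrates the flat ΛCDM expansion history numerically. It is an illustration, not the authors' pipeline: the function names and the Simpson's-rule integrator are assumptions, and only the three cosmological parameters come from the text.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, omega_l=0.7, steps=2000):
    """Line-of-sight comoving distance (Mpc) in flat LambdaCDM via Simpson's rule."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    h = z / steps  # steps must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * inv_e(k * h)
    return (C_KM_S / h0) * total * h / 3.0


def angular_size_distance_mpc(z):
    return comoving_distance_mpc(z) / (1.0 + z)


def luminosity_distance_mpc(z):
    return comoving_distance_mpc(z) * (1.0 + z)


d_l_005 = luminosity_distance_mpc(0.05)
scale_ratio = angular_size_distance_mpc(1.0) / angular_size_distance_mpc(0.04)
print(f"Luminosity distance at z = 0.05: {d_l_005:.0f} Mpc")
print(f"D_A(z = 1) / D_A(z = 0.04): {scale_ratio:.1f}")
```

With these parameters the luminosity distance at z = 0.05 comes out close to the roughly 220 Mpc quoted in the main text, and the angular-size distance grows by about a factor of 10 between z = 0.04 and z = 1.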

Because most of the possible galaxy counterparts detected in the images do not 
have spectroscopic data available, we applied two methods to deal with possible 
stellar contamination from foreground stars. Nearby foreground stars and galaxies 
were identified using segmentation maps produced by SExtractor. We first applied 
the stellar classification technique provided by this tool, which uses a neural net- 
work as a classifier to assign the values 0 and 1 to non-stellar and stellar objects, 
respectively. To separate between galaxies and stars, every object with a stellarity 
index below 0.5 was considered as a secondary galaxy. 

Because some of the AO observations relied on tip-tilt stars, the AO images 
typically had more foreground stars than the HST ones, which could lead to possible 
contamination. We therefore used a second technique to measure the number of 
stars in the entire field of view that are brighter than our second nearby source, 
divided by the total area searched in the image. This number was then compared 
to the search area used to find the nearest companion to provide an estimate for 
stellar contamination. We excluded counterparts with contamination likelihood 
greater than 10%, all of which also had a stellarity index above 0.5 and had already 
been excluded using the aforementioned SExtractor stellar classification technique. 

All galaxies classified as extended, with low stellar contamination and within 
2.5 mag (~1/10 in flux) of the primary AGN or inactive galactic nucleus were classified 
as counterparts (see Extended Data Figs. 1, 2). 
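
The counterpart criteria described above can be summarized in a short sketch. The function names, the circular search area and the use of the expected number of chance stars as the contamination likelihood are all assumptions made for illustration; only the thresholds (stellarity 0.5, contamination 10%, 2.5 mag) come from the text.

```python
import math


def flux_ratio_from_delta_mag(delta_mag):
    """Flux ratio of a source that is delta_mag magnitudes fainter than the primary."""
    return 10.0 ** (-0.4 * delta_mag)


def expected_chance_stars(n_bright_stars, field_area_arcsec2, search_radius_arcsec):
    """Expected number of foreground stars inside a circular search area.

    Assumption: stars brighter than the candidate are spread uniformly over the field.
    """
    surface_density = n_bright_stars / field_area_arcsec2
    return surface_density * math.pi * search_radius_arcsec ** 2


def is_counterpart(stellarity, contamination, delta_mag):
    """Apply the three cuts: extended source, low contamination, within 2.5 mag."""
    return stellarity < 0.5 and contamination <= 0.10 and delta_mag <= 2.5


# Toy example: 2 bright stars in a 40" x 40" field, companion found 3" from the nucleus
contamination = expected_chance_stars(2, 40.0 * 40.0, 3.0)
print(f"Flux ratio at 2.5 mag: {flux_ratio_from_delta_mag(2.5):.2f}")
print(f"Expected chance stars: {contamination:.3f}")
print(is_counterpart(stellarity=0.03, contamination=contamination, delta_mag=1.2))
```

For small values the expected number of chance stars is nearly equal to the probability of at least one interloper, which is why it is used here as a stand-in for the contamination likelihood.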
Inactive-galaxy control sample. We created a large control sample of inactive 
galaxies by aggregating over 20 years of past HST NICMOS and WFC3 surveys 
conducted using the F160W filter. For more massive galaxies, which were not 
well sampled in previous NICMOS surveys, we cross-matched all high-resolution 
HST near-infrared observations with the NASA-Sloan Atlas catalogue”? which 
includes about 42,000 nearby (z< 0.05) massive (M+ > 10°°M.5, where Mz is the 
mass of the Sun) inactive galaxies within the footprint of the Sloan Digital Sky 
Survey. These were typically taken with the HST WFC3 near-infrared camera 
owing to the small field of view of NICMOS. We also cross-matched all nearby 
galaxies (z< 0.05) from the RC3 catalogue’”, which covers the entire sky, with the 
list of HST near-infrared observations. Finally, we cross-matched Version 2.1 of 
the Hubble Source Catalog, which includes all WFC3 near-infrared observations, 
with all nearby galaxies (z< 0.05) from the SIMBAD astronomical database. To 
ensure that our sample included only inactive galaxies, we excluded the 168,941 
AGN in the 13th edition of the Véron-Cetty & Véron catalogue of quasars and 
active nuclei*!. We also excluded any galaxies found in clusters, because of their 
very different environments and generally much higher stellar masses. Our final 
control sample included 385 inactive galaxies, obtained from 37 different HST 
programmes. 

When possible, we used the Hubble Legacy Archive to download post-processed 
images. When these were not available, individual frames were co-added and 
corrected for cosmic rays and distortion, in the same way as the HST observations 
of the BAT AGN in our main sample. The NICMOS images were examined to 
ensure that the smaller field of view covered the nuclear regions without any 
sizeable artefacts after processing. As with our AGN sample, we used the average of 
redshift-independent distance measurements from the NASA Extragalactic Database, 
when possible, and applied the same method to detect galaxy counterparts. 
Control sample design. Although our matching procedure resulted in 385 inactive 
galaxies observed with the HST, many of the relevant HST programmes focused on 
nearby (<100 Mpc) inactive galaxies and with lower stellar masses than our AGN 
sample. This matching procedure for the control sample is crucial, as many studies 
have found that the merger activity and fraction depend on the stellar mass*”*7. We 
therefore measured H-band luminosities in both the inactive-galaxy and BAT AGN 
samples, as they are an excellent proxy for the stellar mass, with a small scatter of 
only*4 0.2 dex. For photometry, we used the H-band elliptical aperture magnitudes 
from the 2MASS all-sky survey. 

The merger fraction also depends on the star formation rate (SFR). We therefore 
used the IRAS 60-μm luminosity as a proxy for the SFR. When the IRAS 60-μm 
luminosity was measured as an upper limit, we used the 70-μm luminosity from the 
Herschel Photodetector Array Camera and Spectrometer or the Spitzer Multiband 
Imaging Photometer. We assumed a conversion factor of 1.15 between the 70-μm 
luminosity and the 60-μm luminosity, based on the average ratio for sources in 
the sample with both measurements. 

The inactive-galaxy sample was matched in stellar mass and SFR by excluding 
117 low-stellar-mass galaxies (log(L_H/L☉) < 9.7; L_H, H-band luminosity; L☉, 
luminosity of the Sun) and 95 low-SFR galaxies (log(νL_ν)_60μm < 43; νL_ν is the 
luminosity expressed in units of erg s⁻¹ and L_ν is the monochromatic specific 
luminosity per unit of frequency, ν). We note that although these 210 inactive 
galaxies were excluded from the analysis, we did not find any nuclear (R < 3 kpc) 
mergers among them. 
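
The two cuts can be expressed as a simple filter. The sketch below uses a hypothetical `Galaxy` record and made-up catalogue entries; only the two thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Galaxy:
    name: str                # hypothetical identifier
    log_lh_lsun: float       # log10 of the H-band luminosity in solar units
    log_nulnu_60um: float    # log10 of the 60-micron luminosity in erg/s


MIN_LOG_LH = 9.7             # stellar-mass proxy threshold from the text
MIN_LOG_NULNU_60UM = 43.0    # SFR proxy threshold from the text


def matched_control(galaxies):
    """Keep only galaxies that pass both the stellar-mass and the SFR proxy cuts."""
    return [
        g for g in galaxies
        if g.log_lh_lsun >= MIN_LOG_LH and g.log_nulnu_60um >= MIN_LOG_NULNU_60UM
    ]


# Made-up entries for illustration only
catalogue = [
    Galaxy("toy-A", 10.2, 44.0),   # passes both cuts
    Galaxy("toy-B", 9.3, 43.8),    # fails the stellar-mass cut
    Galaxy("toy-C", 10.0, 42.5),   # fails the SFR cut
]
kept = matched_control(catalogue)
print([g.name for g in kept])
```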

The sample of lower-luminosity black holes has a lower average H-band 
luminosity (⟨log(L_H/L☉)⟩ = 9.9) than the inactive control sample (⟨log(L_H/L☉)⟩ = 10.1) 
or the sample of obscured luminous black holes (⟨log(L_H/L☉)⟩ = 10.1). The 
sample of unobscured luminous black holes has slightly higher luminosity 
(⟨log(L_H/L☉)⟩ = 10.2) than the inactive control sample; however, for unobscured 
black holes, the AGN light can contribute the majority of the emission even in the 
near-infrared bands*°, so the H-band luminosity may overestimate the stellar 
mass. As for the SFR, we find ⟨log(νL_ν)_60μm⟩ = 44.1 for the inactive galaxies, and 
the same value is obtained for the obscured luminous black holes and for all 
luminous black holes. This value is higher than that of the low-luminosity black holes 
(⟨log(νL_ν)_60μm⟩ = 43.6). 

A summary of the different programmes used in the HST control sample, 
based on their titles and descriptions, is provided in Extended Data Fig. 5 and in a 
machine-readable table in Supplementary Information. Most of the control sample 
was obtained from studies of star formation in luminous infrared galaxies or large 
samples of nearby galaxies (70%, 122/175). Nearly all of the inactive-galaxy images 
from the HST were taken with the F160W filter (172/175, 98%), with the remaining 
images taken with the F110W filter. The average image resolution typically 
corresponded to a full-width at half-maximum (FWHM) of 0.19″ for the 
inactive-galaxy sample, slightly lower than that of the sample of accreting black holes 
(FWHM = 0.12″). However, because the nearby inactive galaxies were typically at 
lower redshift (⟨z⟩ = 0.021) than the black-hole sample (⟨z⟩ = 0.034), the physical 
scales probed for inactive galaxies (⟨FWHM⟩ = 79 pc) were actually smaller than 
those for the black-hole sample (⟨FWHM⟩ = 97 pc), particularly for the obscured 
and unobscured luminous black holes (⟨FWHM⟩ = 134 pc). The average Strehl 
ratio of the black-hole sample was 0.45, mainly owing to the low Strehl ratios in 
the AO sample, whereas the inactive galaxies selected solely from HST data had 
a higher Strehl ratio of 0.9. However, because our study focuses on identifying 
secondary nuclei that are within 2.5 mag of the bright black holes at the galaxy 
centres, the reduced sensitivity to very faint objects with AO does not affect our 
analysis or conclusions. 

We also tested whether the parent sample of inactive galaxies from which the 
matched HST sample was drawn was itself representative, in a statistical sense, 
of the nearby-galaxy population. For the sample of nearby galaxies we used data 
from the Sloan Digital Sky Survey Data Release 7°°. We used spectroscopic red- 
shifts from the New York Value-Added Galaxy Catalog” to limit the sample to the 
0.01 <z<0.05 range to match the control sample. We extracted stellar masses and 
SFR measurements from the Max Planck Institute for Astrophysics-Johns Hopkins 
University (MPA JHU)**"? catalogue, which contains data based on photometry 
and emission-line modelling. We only used sources that are flagged as galaxies 
in the MPA JHU catalogue (about 90,000). To convert the 60-μm emission of 
our HST sample to SFR we assumed standard galaxy templates*®. We found that 
the HST sample was representative of the nearby-galaxy population in terms of 
SFR, except for an excess of high-stellar-mass, high-SFR galaxies related to the 
large programmes that study luminous infrared galaxies (which are usually found 
among such high-stellar-mass, high-SFR galaxies in the nearby Universe)*'. As 
galaxy mergers are thought to be correlated with increased star formation, the lack 
of nuclear mergers in these inactive galaxies is very surprising and strengthens 
our findings of an excess of nuclear mergers among the obscured luminous AGN 
population compared to the control sample. 

Finally, when comparing merger fractions between samples we used the 
binomial proportion confidence intervals, which are typically used to compare the 
fractions of different samples. The normal approximation interval is the simplest 
formula; however, for situations with a fraction very close to zero or small numbers, 
this formula is unreliable” and may considerably underestimate the uncertainties. 
We therefore used the Jeffreys confidence interval to provide more reliable error 
estimates and Fisher's exact test to calculate the P value for the difference between 
the two sample fractions. 
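
As an illustration of the P values, the sketch below implements Fisher's exact test with the standard library and applies it to the merger counts quoted in the main text (6/34, 2/176, 1/55 and 2/73). The function name is hypothetical, and the two-sided convention used here (summing all tables no more probable than the observed one) is an assumption, since the text does not say which variant was applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are samples; columns are (mergers, non-mergers).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x mergers falling in the first sample.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(
        p for p in (prob(x) for x in range(x_min, x_max + 1))
        if p <= p_observed * (1.0 + 1e-9)
    )


# Obscured luminous black holes (6/34) versus the comparison samples
p_inactive = fisher_exact_two_sided(6, 28, 2, 174)    # inactive galaxies, 2/176
p_unobscured = fisher_exact_two_sided(6, 28, 1, 54)   # unobscured luminous, 1/55
p_low_lum = fisher_exact_two_sided(6, 28, 2, 71)      # lower luminosity, 2/73
print(f"{p_inactive:.1e} {p_unobscured:.3f} {p_low_lum:.3f}")
```

Under this convention the comparison with inactive galaxies falls below 0.001, and the other two comparisons come out near 0.01, consistent with the significance levels reported in the text.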
Simulations of galaxies at high redshift. We simulate the systematics of studying 
AGN at the peak of black-hole growth at higher redshift (z~ 1) by artificially 
redshifting our imaging data to mimic the quality of the HST data at this redshift, 
following ref. *. Because of the low redshift of our samples (z= 0.04), the physical 
resolution of our ground-based Pan-STARRS images is equal or superior to the 
HST data for a z ~ 1 sample. The images also have complete wavelength coverage 
in the g, r, i, z and y filters (λ_eff = 4,776 Å, 6,130 Å, 7,485 Å, 8,658 Å and 9,603 Å, 
respectively), and thus we can properly consider the rest-frame wavelengths and 
spectral energy distributions of the artificially redshifted datasets. We assume that 
the redshifted galaxies are located at z = 1 and are observed in the WFC3 F160W 
band, achieving the same depth as the CANDELS and GOODS-S surveys“. The 
FERENGI algorithm* is then used to determine the best-fit rest-frame spectral 
energy distribution templates using the kcorrect routine and to calculate the 
expected flux in the WFC3 F160W band. Finally, the output spatial flux distribution 
is convolved with the point-spread function of the WFC3 F160W band, and 
a noise frame is added using a blank region extracted from the CANDELS and 
GOODS-S surveys. 

Our simulated observations (Extended Data Fig. 7) show that the HST can 
resolve only one of these late-stage mergers with tight double nuclei. This is not 
surprising, given the stark difference in physical scales between z ~ 0.04 and z ~ 1 
(D_A(z = 1)/D_A(z = 0.04) ≈ 10, where D_A is the angular-size distance at a given redshift) 
and the detection being based on the visibility of tight double nuclei. Although 
our asymmetric/disturbed structures at the outskirts of the galaxies may still be 
visible at z~ 1 in a couple of cases with a considerably increased level of brightness 
owing to the increase in star formation activity at high redshift, the interpretation 
of these structures can be ambiguous, given that the morphology of high-redshift 
star-forming galaxies is often intrinsically less regular than nearby (<250 Mpc) 
galaxies. 
Simulations of merging galaxies. We use high-resolution galaxy merger simula- 
tions performed with GADGET, a smoothed-particle hydrodynamics and N-body 
code that conserves energy and entropy and uses sub-resolution physical models 
for radiative heating and cooling, star formation, supernova feedback, metal 
enrichment and a multi-phase interstellar medium“. Black holes are modelled as 
gravitational 'sink' particles that accrete gas via an Eddington-limited, 
Bondi-Hoyle-like prescription. Thermal AGN feedback is included by coupling 5% of the 
accretion luminosity (L_bol = ε_rad Ṁc²) to the surrounding gas as thermal energy, 
with a variable radiative efficiency ε_rad at low accretion rates*” (where Ṁ is the 
accretion rate and c is the speed of light in vacuum). 
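
To give a sense of scale for the feedback prescription, the sketch below evaluates the accretion luminosity and the 5% thermally coupled power for a given accretion rate. It is illustrative only: the function names are hypothetical, the fixed radiative efficiency of 0.1 is an assumed value that the text does not state, and the variable efficiency at low accretion rates is not modelled.

```python
M_SUN_G = 1.989e33       # solar mass in grams
YEAR_S = 3.156e7         # one year in seconds
C_CM_S = 2.998e10        # speed of light in cm/s


def accretion_luminosity_erg_s(mdot_msun_yr, eps_rad=0.1):
    """Accretion luminosity eps_rad * Mdot * c^2 in erg/s.

    Assumption: eps_rad = 0.1 is a commonly used illustrative value,
    not a number taken from the text.
    """
    mdot_g_s = mdot_msun_yr * M_SUN_G / YEAR_S
    return eps_rad * mdot_g_s * C_CM_S ** 2


def thermal_feedback_power_erg_s(mdot_msun_yr, eps_rad=0.1, coupling=0.05):
    """Power deposited in the surrounding gas: 5% of the accretion luminosity."""
    return coupling * accretion_luminosity_erg_s(mdot_msun_yr, eps_rad)


l_acc = accretion_luminosity_erg_s(1.0)
l_feedback = thermal_feedback_power_erg_s(1.0)
print(f"Accretion luminosity at 1 Msun/yr: {l_acc:.2e} erg/s")
print(f"Thermal feedback power:            {l_feedback:.2e} erg/s")
```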

Our simulation suite includes seven major-merger simulations with galaxy mass 
ratios of 0.5 or 1. Each of the galaxies has a dark-matter halo, a disk of gas and 
stars (with initial gas fractions of 0.1-0.3), a stellar bulge-to-total ratio of 0 or 0.2, 
and a central black hole with initial mass scaled to the stellar bulge**. The fiducial 
baryonic gravitational softening length and mass resolution are €ogray = 48 pc and 
my =2.8 X 10°Mo, respectively. We also run two simulations at ten-times-higher 
mass resolution to ensure that our results are not resolution-dependent. We stress 
that these details are not crucial for the purpose of the present work, where the 
simulations are used to assess the relative timescales in which merging galactic 
nuclei and black holes can be seen at various separations. 

We also conduct radiative-transfer simulations in post-processing with the 
three-dimensional, polychromatic, dust radiative-transfer code SUNRISE”. 
This publicly available code has been used extensively with GADGET-3 to model 
a wide range of isolated and merging galaxy populations®!**. Stellar emission is 
calculated from age- and metallicity-dependent STARBURST99 spectral energy 
distributions for each stellar particle, and emission from H II regions (including 
dusty photo-dissociation regions) around young stars is calculated using the 
MAPPINGS III models**. We implement an AGN spectral energy distribution 
based on the black-hole accretion rate, and our fiducial model is based on empir- 
ically derived, luminosity-dependent templates™®. 

After the dust distribution is calculated with SUNRISE from the gas-phase 
metal density distribution, we use SUNRISE to perform Monte Carlo radiative 
transfer through the dust grid, by computing the energy absorption (including 
dust self-absorption) and thermal re-emission to produce the emergent, spatially 
resolved ultraviolet-to-infrared spectral energy distributions. For each merger 
simulation, we run SUNRISE on snapshots at 10-Myr intervals during the merger 
phase (R < 10-30 kpc) and post-merger phases, and at 100-Myr intervals during 
the early-merger phase, for seven isotropically distributed viewing angles, and the 
result is converted to the merger fraction that would be seen if observed from a 
single direction. 
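
The conversion from three-dimensional snapshots to what a single observer would see can be sketched as follows. This is a simplified illustration, not the SUNRISE workflow: the function names are hypothetical, each snapshot is reduced to a time step and a nuclear separation vector, and the toy example uses three viewing directions instead of the seven isotropically distributed ones.

```python
import math


def projected_separation(separation_vec, view_dir):
    """Separation component perpendicular to the observer's line of sight."""
    norm = math.sqrt(sum(v * v for v in view_dir))
    unit = [v / norm for v in view_dir]
    along = sum(s * u for s, u in zip(separation_vec, unit))
    perp_sq = sum(s * s for s in separation_vec) - along * along
    return math.sqrt(max(perp_sq, 0.0))


def time_in_projected_range(snapshots, view_dirs, r_min, r_max):
    """Mean time (Myr) per viewing direction with r_min <= projected separation < r_max.

    snapshots: list of (time_step_myr, separation_vec_kpc) tuples.
    """
    total = 0.0
    for time_step, separation in snapshots:
        for direction in view_dirs:
            if r_min <= projected_separation(separation, direction) < r_max:
                total += time_step
    return total / len(view_dirs)


# Toy example: two 10-Myr snapshots with nuclei 5 kpc and 2 kpc apart along x
toy_snapshots = [(10.0, (5.0, 0.0, 0.0)), (10.0, (2.0, 0.0, 0.0))]
toy_views = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
t_close = time_in_projected_range(toy_snapshots, toy_views, 0.0, 3.0)
print(f"Mean time seen at < 3 kpc: {t_close:.1f} Myr")
```

In the toy case the 5-kpc snapshot counts as a close pair only when viewed along the separation axis, which is the projection effect the text describes.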

Code availability. The custom NIRC2 reduction software is available at 
https://github.com/jluastro/JLU-python-code/tree/master/jlu. 


Data availability 

The reduced imaging datasets from the HST are available from the Hubble Legacy 
Archive. The raw imaging datasets from the near-infrared adaptive optics pro- 
grammes are available from the Keck Observatory Archive. Other reduced datasets 
generated or analysed in this study are available from the corresponding author 
on reasonable request. 
Extended Data Fig. 1 | Other close mergers. a-d, Tricolour optical images in the 
gri band from the Sloan Digital Sky Survey or the Kitt Peak survey with about 1″ 
angular resolution. The galaxies shown are NGC 6240 (a), 2MASX J00253292+6821442 (b), 
ESO 509-G027 (c) and Mrk 975 (d) from the AGN sample. The images are 60 kpc × 60 kpc 
in size. Red squares indicate the size of the zoomed-in AO image on the right. 
e-h, High-spatial-resolution images of the nuclear mergers shown in a-d, 
4 kpc × 4 kpc in size. 


© 2018 Springer Nature Limited. All rights reserved. 


LETTER 




Extended Data Fig. 2 | Other close mergers. a-c, Tricolour optical images in the 
gri band from the Sloan Digital Sky Survey or the Kitt Peak survey with about 1″ 
angular resolution. The galaxies shown are 2MASX J16311554+2352577 (a) and 
2MASX J08434495+3549421 (b) from the AGN sample and 2MASX J08370182-4954302 (c) 
from the inactive-galaxy sample. d, Lower-quality red Digitized Sky Survey image 
of UGC02369 NED01, for which no higher-quality imaging exists. The images in a-d 
are 60 kpc × 60 kpc in size. Red squares indicate the size of the zoomed-in AO 
image on the right. e-h, High-spatial-resolution near-infrared images of the 
nuclear mergers shown in a-d, 4 kpc × 4 kpc in size. 

Extended Data Fig. 3 | Inactive-galaxy control sample. a-d, Tricolour optical 
images in the gri band from Pan-STARRS imaging with about 1″ angular resolution. 
The images show inactive galaxies in the control sample that were matched in 
stellar mass and SFR to the AGN: NGC 214 (a), NGC 151 (b), NGC 2998 (c) and 
NGC 6504 (d). The images are 60 kpc × 60 kpc in size. Red squares indicate the 
size of the zoomed-in AO image on the right. e-h, High-spatial-resolution 
near-infrared images of the nuclear mergers shown in a-d, 4 kpc × 4 kpc in size. 
Some white lines are present in NICMOS and Pan-STARRS imaging owing to bad pixels 
with very low or zero response or with very high or erratic dark current. 
Extended Data Fig. 4 | Stellar mass, star formation rate and resolution 
of AGN and inactive galaxies. a, H-band luminosity of the different AGN 
and inactive galaxies. Inactive galaxies with considerably lower stellar 
masses than the AGN samples were excluded (log(L_H/L⊙) < 9.7). 
b, 60-μm luminosity of the different AGN and inactive galaxies. Inactive 
galaxies with lower SFR were also excluded from the comparison 
(log(νL_ν)_60μm = 43.6). For observations in which a galaxy was not 
detected, we show a 3σ upper limit of the SFR, indicated by a downward 
arrow. c, Comparison of the maximum spatial resolution (in parsecs) 
of the different observations. The inactive-galaxy sample typically has 
higher physical spatial resolutions than the AGN samples. Many galaxies 
observed fall along a line because of the constant physical resolution of the 
HST. 
HST control sample. The majority of archival control sample observations 
are of high-SFR luminous infrared galaxies (‘LIRG’) or from studies 
of volume-limited samples of nearby galaxies (‘Nearby Galaxy’). The 
remaining samples originate from observations of spiral galaxies (‘Spiral’), 
galaxies with large or small black holes (‘Black Hole’) and elliptical galaxies 
(‘Elliptical’). Finally, some nearby galaxies were observed serendipitously 
in observations of other sources or survey fields (‘Serendip’). 
Extended Data Fig. 6 | SFR and stellar mass. Measurements of SFR 
and stellar mass for the BAT AGN sample (purple circles) and the HST-matched 
archival control sample of inactive galaxies (green diamonds). 
The full distribution of inactive galaxies from the Sloan Digital Sky Survey 
(SDSS) is shown with grey shading and the full distribution of the HST 
archive with blue contours. The HST archival sample has an excess of 
high-stellar-mass, high-SFR inactive galaxies because of the large number 
of observations of luminous infrared galaxies. 
Extended Data Fig. 7 | Simulated HST images of nuclear mergers 
at high redshift. Simulated images of three nuclear mergers 
(2MASX J01392400+2924067, CGCG 341-006, MCG+02-21-013) 
observed at z = 1 with the HST F160W filter as part of the CANDELS 
survey (60 mas pixel^-1) using optical imaging and FERENGI software. The 
HST would be unable to detect these final stage mergers. All simulated 
images are displayed in the arcsinh scale in charge-coupled-device 
counts, as if observed in the HST F160W filter as part of the CANDELS 
survey. 
Extended Data Table 1 | Galaxies with companions within 10 kpc 

Galaxy | Class 
MCG+02-21-013 | LumObs 
NGC 6240 | LumObs 
2MASXJ08370182-4954302 | Inactive 
2MASXJ00253292+6821442 | LowLum 
CGCG341-006 | LumObs 
UGC02369NED01 | Inactive 
2MASXJ01392400+2924067 | LumObs 
Mrk975 | LumUnob 
2MASXJ16311554+2352577 | LumObs 
2MASXJ08434495+3549421 | LumObs 
ESO099-G004 | Inactive 
MCG+12-02-001 | Inactive 
NGC985 | LumUnob 
IRAS23436+5257 | Inactive 
MCG-02-33-098 | Inactive 
Mrk739E | LowLum 
NGC6090NED02 | Inactive 
NGC3588NED01 | LowLum 
Mrk463 | LumObs 
2MASXJ06094582-2140234 | Inactive 
Mrk423 | LowLum 
2MASXJ05442257+5907361 | LumObs 
IRAS21101+5810 | Inactive 
IIZW096NED02 | Inactive 
IRASF03359+1523 | Inactive 
NGC7212NED02 | LowLum 
Was49b | LumObs 
NGC2672 | Inactive 
UGC04881 | Inactive 
2MASXJ17085915+2153082 | LumUnob 


The table lists the sources found to have counterparts within 10 kpc. Obscured and unobscured AGN are separated using the presence of broad Hβ lines in their optical spectra, and the separation 
between low- and high-luminosity AGN (below or above L_bol = 2 × 10^44 erg s^-1, respectively) is based on their X-ray emission (‘LowLum’, low-luminosity AGN; ‘LumUnob’, luminous unobscured AGN; 
‘LumObs’, luminous obscured AGN). The separation d between the two galaxy nuclei is given in arcseconds and kiloparsecs. The stellar contamination (‘Stellar Contam’) indicates the likelihood of 
a stellar source of this brightness occurring randomly in the same search area. Finally, the measured stellarity index from a neural net (‘Stell.’) and the difference (in mag) between the primary and 
secondary galaxies in the merger (‘Diff mag’) are also listed. 
Observation of universal dynamics in a spinor Bose gas far from equilibrium 

Maximilian Prüfer, Philipp Kunkel, Helmut Strobel, Stefan Lannig, Daniel Linnemann, Christian-Marcel Schmied, 
Jürgen Berges, Thomas Gasenzer & Markus K. Oberthaler 


Predicting the dynamics of quantum systems far from equilibrium 
represents one of the most challenging problems in theoretical 
many-body physics. While the evolution of a many-body system 
is in general intractable in all its details, relevant observables can 
become insensitive to microscopic system parameters and initial 
conditions. This is the basis of the phenomenon of universality. 
Far from equilibrium, universality is identified through the 
scaling of the spatio-temporal evolution of the system, captured 
by universal exponents and functions. Theoretically, this has 
been studied in examples as different as the reheating process 
in inflationary Universe cosmology, the dynamics of nuclear 
collision experiments described by quantum chromodynamics, 
and the post-quench dynamics in dilute quantum gases in 
non-relativistic quantum field theory. However, an experimental 
demonstration of such scaling evolution in space and time in a 
quantum many-body system has been lacking. Here we observe the 
emergence of universal dynamics by evaluating spatially resolved 
spin correlations in a quasi-one-dimensional spinor Bose-Einstein 
condensate. For long evolution times we extract the scaling 
properties from the spatial correlations of the spin excitations. 
From this we find the dynamics to be governed by an emergent 
conserved quantity and the transport of spin excitations towards 
low momentum scales. Our results establish an important class 
of non-stationary systems whose dynamics is encoded in 
time-independent scaling exponents and functions, signalling the 
existence of non-thermal fixed points. We confirm that 
the non-thermal scaling phenomenon involves no fine-tuning of 
parameters, by preparing different initial conditions and observing 
the same scaling behaviour. Our analogue quantum simulation 
approach provides the basis with which to reveal the underlying 
mechanisms and characteristics of non-thermal universality classes. 
One may use this universality to learn, from experiments with 
ultracold gases, about fundamental aspects of dynamics studied 
in cosmology and quantum chromodynamics. 

Isolated quantum many-body systems offer particularly clean settings 
for studying fundamental properties of the underlying unitary 
time evolution. For systems initialized far from equilibrium, different 
scenarios have been identified, including the occurrence of many-body 
oscillations and revivals, the manifestation of many-body localization, 
and quasi-stationary behaviour in a prethermalized stage of 
the evolution. 

Here we observe a new scenario associated with the notion of 
non-thermal fixed points. This is illustrated schematically in Fig. 1a: 
starting from a class of far-from-equilibrium initial conditions, the 
system develops a universal scaling behaviour in time and space. This 
is a consequence of the effective loss of details about initial conditions 
and system parameters long before a quasi-stationary or equilibrium 
situation may be reached. The transient scaling behaviour is found to be 
governed by the transport of an emergent collective conserved quantity 
towards low momentum scales. 


For our experimental study we employ an elongated Bose-Einstein 
condensate of about 70,000 $^{87}$Rb atoms. We use the $F = 1$ hyperfine 
manifold with its three magnetic sublevels $m_F = 0, \pm 1$ as a spin-1 system 
with ferromagnetic interactions. Initially, all atoms are prepared 
in the $m_F = 0$ sublevel, forming a spinor condensate with zero spin 
length. The dynamics is initiated by instantaneously changing the 
energy splitting of the $F = 1$ magnetic sublevels by means of microwave 
dressing (see Methods). Consequently, spin excitations develop in the 
$F_x$-$F_y$ plane as sketched in Fig. 1b. Our experimental setup allows the 
extraction of the spin distribution in terms of the spin component 
$\hat{F}_x(y) = [\hat{\psi}_0^\dagger(y)(\hat{\psi}_{+1}(y) + \hat{\psi}_{-1}(y)) + \mathrm{h.c.}]/\sqrt{2}$, 
where $\hat{\psi}_m^\dagger(y)$ is the creation operator of an atom in the magnetic 
sublevel $m$ at position $y$ and h.c. denotes the Hermitian conjugate. At a given 
time $t$ this is achieved by a spin rotation from the $F_x$-$F_y$ plane to the 
$F_z$-direction and subsequently detecting the atomic density difference 
$F_x(y) = n_{+1}(y) - n_{-1}(y)$ (see Methods for details). Representative 
absorption images are shown in Fig. 1c together with the extracted spin 
profiles (green lines). The histograms in Fig. 1c show the probability 
distribution of $F_x$ for all positions $y$ and experimental realizations for 
the corresponding evolution time (see Extended Data Fig. 1 for all 
evolution times). Results are presented for characteristic stages associated 
with the initial condition (1), the nonequilibrium instability 
regime (2), the universal scaling regime (3) and the departure from the 
non-thermal fixed point (4), as also indicated in Fig. 1a. 

We find that during the time evolution the angular orientation $\theta$ of 
the transverse spin (see Fig. 1b) becomes the relevant dynamical degree 
of freedom. For short evolution times unstable longitudinal spin modes 
grow exponentially, well described by Bogoliubov theory, but nonlinear 
evolution quickly takes over (after about 100 ms). This leads to 
a double-peaked structure of the histograms (see Fig. 1c) indicating 
that the spin has a mean length and a random orientation in the $F_x$-$F_y$ 
plane. On the basis of this observation we extract the mean spin length 
$\langle |F_\perp(t)| \rangle$, where $F_\perp = F_x + iF_y$, and its fluctuations using a fit. Building 
on that knowledge, we extract the local angle from the profiles as 
$\theta(y,t) = \arcsin(F_x(y,t)/\langle |F_\perp(t)| \rangle)$ (see Methods for details). 

The time evolution of the fluctuations of the spin orientation is 
described in terms of correlation functions of the scalar field $\theta(y,t)$. The 
fluctuations are analysed by evaluating the two-point correlation function 
$C(y,y';t) = \langle \theta(y,t)\,\theta(y',t) \rangle$. To distinguish the role of different length 
scales we consider a momentum-resolved picture of the dynamics. 
Hence we evaluate the structure factor, which is the Fourier transform 
of $C(y,y';t)$ with respect to the relative coordinate $\bar{y} = y' - y$, averaged 
over $y$: 

$$f_\theta(k,t) = \int dy \int d\bar{y}\; C(y+\bar{y},\,y;\,t)\,\exp(-i 2\pi k \bar{y}) \qquad (1)$$
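
To make equation (1) concrete, the following Python sketch checks numerically that, for a discretized profile with periodic boundaries, the double sum over $y$ and $\bar{y}$ equals the squared modulus of the discrete Fourier transform of $\theta$. The function names, the grid size and the random test profiles are illustrative assumptions; this is not the analysis code used in the experiment, and normalization factors are left out.

```python
import numpy as np

def structure_factor_direct(theta):
    """Discrete analogue of equation (1): Fourier transform of the
    correlation C(ybar) = sum_y theta(y + ybar) * theta(y).
    Periodic boundaries are an assumption made only for this illustration."""
    n = theta.size
    corr = np.array([np.sum(np.roll(theta, -ybar) * theta) for ybar in range(n)])
    return np.fft.fft(corr).real

def structure_factor_fft(theta):
    """Same quantity via the Wiener-Khinchin relation, |FFT(theta)|^2."""
    return np.abs(np.fft.fft(theta)) ** 2

# Hypothetical ensemble: 20 realizations of a 64-point angle profile.
rng = np.random.default_rng(0)
profiles = rng.uniform(-np.pi / 2, np.pi / 2, size=(20, 64))

f_direct = np.mean([structure_factor_direct(p) for p in profiles], axis=0)
f_fft = np.mean([structure_factor_fft(p) for p in profiles], axis=0)
max_dev = float(np.max(np.abs(f_direct - f_fft)))
print(f"largest deviation between the two evaluations: {max_dev:.2e}")
```

In practice the second form is the convenient one, since it needs a single Fourier transform per realization before averaging.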


In general, the structure factor $f_\theta$ is a function of momentum $k$ which 
evolves in time $t$ in a way determined by the system parameters and 
initial conditions. In Fig. 2a, we plot $f_\theta(k,t)$ as a function of $k$ on a 
double-logarithmic scale for times between 4 s and 9 s. A characteristic shift of 
the structure factor towards smaller momenta as well as an increase of 
the low-momentum amplitude with time is observed. 

Fig. 1 | Universal dynamics and experimental procedure. a, Starting 
from a class of far-from-equilibrium initial conditions, universal 
dynamical evolution indicates the emergence of a non-thermal fixed point. 
Experimentally, we probe the system at different evolution times during 
the stages indicated by numbers 1 to 4. b, A condensate is prepared in the 
$m_F = 0$ state of the $F = 1$ hyperfine manifold, that is, with a vanishing mean 
spin length (left spin sphere). With microwave dressing (see Methods) 
we initiate spin-exchange dynamics, which leads to a growth of spin 
orthogonal to the magnetic field B in the $F_x$-$F_y$ plane (right spin sphere). 
Subsequently, spatial structures of the spin orientation $\theta$ are found along 
the cloud. c, Exemplary absorption images of the three hyperfine levels 
taken after a $\pi/2$ spin rotation and Stern-Gerlach separation together with 
the inferred local spin $F_x(y)$ (green lines). Furthermore, histograms for 
around 160 experimental realizations are shown. In the universal regime 
(see step 3 in panel a) we extract the spin length and its fluctuation by a 
fit to the double-peaked structure of the histogram, as indicated in the 
corresponding plot (see Methods). 

Fig. 2 | Scaling in space and time at a non-thermal fixed point. 
a, Structure factor $f_\theta(k,t)$ as a function of the spatial momentum $k = 1/\lambda$ in 
the scaling regime between 4 s and 9 s. The colour indicates the evolution 
time $t$. The statistical error is of the order of the size of the plot markers. 
In the infrared the structure factor shifts in time to smaller $k$ (bigger 
wavelengths), which is connected to transport of excitations towards lower 
momenta. Characteristic for the non-thermal fixed point dynamics is the 
rescaling of the amplitude with universal exponent $\alpha$ and rescaling of the 
length scale with $\beta$ (see inset). b, By rescaling the data with $t_{\rm ref} = 4.5$ s, 
$\alpha = 0.33$ and $\beta = 0.54$ the data collapse to a single curve. We parameterize 
the universal scaling function with $f_s \propto 1/[1 + (k/k_s)^\zeta]$. Using a fit (grey 
solid line) we find $\zeta \approx 2.6$ and $k_s \approx 1/133$ μm⁻¹. The quality of the rescaling 
is revealed by the small and symmetric scatter of the rescaled data divided 
by the fit (see inset). 

In fact, instead of separately depending on $k$ and $t$ we find that in this 
regime the datasets collapse to a single curve if the rescaled distribution 
$t^{-\alpha} f_\theta$ is plotted as a function of the single variable $t^{\beta} k$. This implies that 
the data satisfy the scaling form 

$$f_\theta(k,t) = t^{\alpha} f_s(t^{\beta} k) \qquad (2)$$

with universal scaling exponents $\alpha$, $\beta$ and scaling function $f_s$. Figure 2b 
shows this collapse, where the same data points as in Fig. 2a are plotted 
with times normalized to the reference time $t_{\rm ref} = 4.5$ s. The ability 
to reduce the full nonequilibrium time evolution of the correlation 
function in the scaling regime to a time-independent, so-called fixed-point 
distribution $f_s(k)$ and associated scaling exponents is a striking 
manifestation of universality. 

We find the amplitude scaling exponent to be $\alpha = 0.33 \pm 0.08$ and the 
momentum scaling exponent to be $\beta = 0.54 \pm 0.06$. The errors correspond 
to one standard deviation obtained from a resampling technique 
(see Methods). However, the actual uncertainty for $\alpha$ is expected to be 
larger since the rescaling analysis is much less constraining on $\alpha$ than 
on $\beta$. We find that $f_\theta(k,t)$ develops a plateau at the lowest momenta and an 
approximate power-law fall-off above a characteristic length scale in the 
scaling regime. To parameterize the universal scaling function, we fitted 
the rescaled data with a function of the form $f_s(k) \propto 1/[1 + (k/k_s)^\zeta]$ 
and find $\zeta \approx 2.6$, with $k_s \approx 1/133$ μm⁻¹ for our system. The value of 
$\zeta$ becomes constant after about 4 s (see Fig. 3a). Analysing $f_\theta(k = 0, t)$ 
as shown in Fig. 3b reveals that the occupation of $k = 0$, which cannot 
be seen on the logarithmic scale employed in Fig. 2, builds up in the 
scaling regime. This growth is consistent with the power law proportional 
to $t^{\alpha}$ with $\alpha$ obtained from the rescaling analysis, as indicated by 
the solid line. After 9 s the system departs from the scaling behaviour. 

Fig. 3 | Characterization of the scaling regime. a, For each evolution 
time (see Fig. 2) we extract the power-law exponent $\zeta$ from a fit. After 
4 s it settles to about 2.6 (red solid line), revealing the build-up of the 
universal scaling function. The grey-shaded region indicates the scaling 
regime. b, The transport to the infrared in the scaling regime is connected 
to a monotonic increase of the occupation of $k = 0$. The solid line depicts 
the expected scaling $f_\theta(k = 0, t) \propto t^{\alpha}$ with $\alpha = 0.33$. After 9 s a rapid decay 
signals the departure from the scaling regime. c, The emergence of a 
conserved quantity is signalled by the sum over all k-modes of $f_\theta(k,t)$. 
After a fast initial growth this observable is approximately constant in the 
scaling regime and starts to decay after 9 s. 

Fig. 4 | Robustness of universal dynamics at a non-thermal fixed point. 
a, Absorption images of all three $m_F$ components after spin rotation 
with the extracted transversal spin (solid lines) of three different initial 
conditions. The preparations show different initial amplitudes in the 
Fourier transform of the transversal spin. b, All initial conditions lead 
to scaling dynamics. The data shown were obtained in a time window 
between 4 s and 9 s after preparation of the initial state. In the inset the 
scaling exponents of all four initial conditions, including the preparation 
in $m_F = 0$ (see Fig. 1), are shown; the error bars are 1 s.d., obtained from a 
resampling method (see Methods). The mean values (red and blue solid 
lines) of $\alpha$ and $\beta$ are used to rescale the data. We allow for overall scaling 
factors in $k$ and amplitude for each initial condition. 

The nature of the observed scaling phenomenon is explained by 
the emergence of an approximately conserved quantity and its transport. 
In terms of our dynamical degree of freedom $\theta(y,t)$ we identify 
$\int dk\, \langle |\theta(k,t)|^2 \rangle = \int dk\, f_\theta(k,t)$ as the conserved quantity. In fact, Fig. 3c 
shows that the sum over all modes $k$ for different evolution times, 
after a fast initial rise due to the instability, settles around a constant 
within the scaling regime (see also Extended Data Fig. 2). According to 
the scaling (2), $\int dk\, f_\theta(k,t) = t^{\alpha-\beta} \int dk\, f_s(k) = \mathrm{const.}$ corresponds to $\alpha = \beta$, 
so that in our case only one independent dynamical scaling exponent 
remains. A distinct feature is the transport of the conserved quantity 
directed towards the infrared, corresponding to a positive sign of $\beta$. 
Theoretically it is expected to find the scaling only for momenta smaller 
than some scale (in our case about 0.04 μm⁻¹; see Extended Data 
Fig. 3). The transport towards the infrared is in contrast to the turbulent 
transport into the ultraviolet observed in direct cascades. 
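
The step from the scaling form (2) to a single independent exponent can be written out explicitly. The block below is a short worked derivation under the assumption that the scaling form holds for all momenta that contribute to the integral; the substitution variable $k'$ is introduced here only for illustration.

```latex
\begin{aligned}
\int dk\, f_\theta(k,t)
  &= \int dk\, t^{\alpha} f_s\!\left(t^{\beta} k\right)
  && \text{scaling form (2)} \\
  &= t^{\alpha-\beta} \int dk'\, f_s(k')
  && \text{with } k' = t^{\beta} k,\; dk = t^{-\beta}\, dk' \\
  &= \text{const.}
  \;\Longleftrightarrow\; \alpha = \beta .
\end{aligned}
% In d spatial dimensions the measure d^d k contributes t^{-d\beta},
% so the same argument gives \alpha = d\beta.
```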

These experimental findings of scaling behaviour, implying universality, 
allow comparison with predictions in a variety of models in the 
non-thermal universality class, which is defined by the scaling function 
$f_s$ and $\alpha = d\beta$ for given spatial dimension $d$. $N$ interacting Bose 
gases with equal intra- and interspecies Gross-Pitaevskii couplings 
are described by an O(N) symmetric model. This is closely related to 
O(N) symmetric scalar models, such as the relativistic Higgs sector 
of the Standard Model with $N = 4$ for $d = 3$. For these types of models, 
both Gross-Pitaevskii and relativistic, a universal value of $\beta \approx 0.5$ has 
been predicted and found to be insensitive to the spatial dimension 
for $d > 2$. This describes the self-similar transport of excitations of the 
relative phases between the components to lower wavenumbers. The 
scaling function $f_s$ is known to depend on dimensionality and has not 
yet been theoretically estimated for $d = 1$. Our setup is, to our knowledge, 
the first realization of an effective $N = 3$ model for the transport 
of conserved quantities associated with non-thermal fixed points in 
a quasi-one-dimensional situation. Finding scaling behaviour in one 
dimension was not expected and sheds new light on the concept of 
universality classes far from equilibrium. 

We emphasize that the non-thermal scaling phenomenon studied 
here involves no fine-tuning of parameters. This is in contrast to 
equilibrium critical phenomena, which require a careful adjustment of 
system variables, such as the temperature, to a critical value. To illustrate 
this insensitivity we employ the high level of control of the atomic spin 
system and prepare three qualitatively different initial conditions (for 
details see Methods). The corresponding absorption images of single 
realizations are shown in Fig. 4a along with the Fourier transform of 
the spatial correlation function of $F_x(y)$. 

We find universal dynamics for all initial conditions with comparable 
inferred scaling exponents (see inset of Fig. 4b). We rescale the data 
with the same exponents obtained from the mean of all four meas- 
urements and take into account overall scaling factors and reference 
momentum scales. This procedure leads to a collapse of all data, man- 
ifesting the robustness of non-thermal fixed point scaling. 

The level of control demonstrated here and the accessible observables 
on our platform open the door to the discovery of further non-thermal 
universality classes. This represents a crucial step towards a compre- 
hensive understanding of out-of-equilibrium dynamics with potential 
impact in various fields of science. 

(We note that similar phenomena have recently been observed by 
the Schmiedmayer group in Vienna in a single-component Bose gas, 
where a scaling exponent $\beta \approx 0.1$ was extracted.) 


Online content 

Any methods, additional references, Nature Research reporting summaries, source 
data, statements of data availability and associated accession codes are available at 
https://doi.org/10.1038/s41586-018-0659-0. 
METHODS 


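Microscopic parameters. The dynamics of the spinor Bose gas is described by 
the Hamiltonian 

$$\hat{H} = \hat{H}_0 + \int dV \left[ :\frac{c_0}{2}\hat{n}^2 + \frac{c_1}{2}\left(\hat{F}_x^2 + \hat{F}_y^2 + \hat{F}_z^2\right): + q\left(\hat{n}_{+1} + \hat{n}_{-1}\right) + p\hat{F}_z \right] \qquad (3)$$

where $\hat{n}_m = \hat{\psi}_m^\dagger \hat{\psi}_m$, with $\hat{\psi}_m^\dagger$ the bosonic field creation operator of the magnetic 
substate $m \in \{0, \pm 1\}$, and : : denotes normal ordering. $\hat{H}_0$ contains the 
spin-independent kinetic energy and trapping potential. The spin operators are given 
by $\hat{F}_x = [\hat{\psi}_0^\dagger(\hat{\psi}_{+1} + \hat{\psi}_{-1}) + \mathrm{h.c.}]/\sqrt{2}$ and $\hat{F}_y = [i\hat{\psi}_0^\dagger(\hat{\psi}_{+1} - \hat{\psi}_{-1}) + \mathrm{h.c.}]/\sqrt{2}$ and 
$\hat{F}_z = \hat{n}_{+1} - \hat{n}_{-1}$. The parameter $p$ describes the linear Zeeman shift in a magnetic 
field. For the hyperfine spin $F = 1$ of $^{87}$Rb the spin interaction is ferromagnetic, 
that is, $c_1 < 0$. 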

For the experimental control parameter $q > 2n|c_1|$, with $n$ being the total density, 
the mean-field ground state is the polar state, which corresponds to all atoms 
occupying the $m_F = 0$ state. In the range $0 < q < 2n|c_1|$ a spin with non-vanishing 
length in the x-y plane is energetically favoured (easy-plane ferromagnet). This 
is the parameter regime employed in the experiment. 

Experimental system. We prepare a Bose-Einstein condensate of about 70,000 
atoms in the state $(F, m_F) = (1, 0)$ in an optical dipole trap of 1,030 nm light with 
trapping frequencies $\omega_\parallel \approx 2\pi \times 2.2$ Hz and $\omega_\perp \approx 2\pi \times 250$ Hz. 

The control parameter $q$ is given by $q = q_B - q_{\rm MW}$, where $q_B \approx 2\pi \times 56$ Hz is the 
second-order Zeeman splitting in a magnetic field of $B \approx 0.884$ G and $q_{\rm MW} = \Omega^2/(4\delta)$ 
is the energy shift due to the microwave dressing. For dressing we use a 
power-stabilized microwave generator with resonant Rabi frequency $\Omega \approx 2\pi \times 5.3$ kHz 
and $\delta \approx 2\pi \times 137$ kHz blue detuned with respect to the (1, 0) ↔ (2, 0) transition. 
For the spin dynamics we adjust $\Omega$ and $\delta$ so that $q \approx n|c_1|$ (with $nc_1 \approx -2\pi \times 2$ Hz). 
To monitor the long-term stability of $q$ we do a reference measurement every 4 h 
(corresponding to about 250 experimental realizations). For this we observe spin 
dynamics for a fixed evolution time of 4 s as a function of the control parameter 
$q$ (changing the detuning $\delta$). Analysing the integrated side mode population we 
infer that the drifts of $q$ are well below 0.5 Hz. 
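
The relation $q = q_B - \Omega^2/(4\delta)$ can be explored with a few lines of Python. This is only a sketch: all quantities are expressed as frequencies in hertz (that is, divided by $2\pi$), the function names are hypothetical, and the inputs are the rounded values quoted above. Because $q$ is the small difference of two shifts of roughly 50 Hz each, it reacts strongly to small changes of the detuning, so the rounded inputs should not be expected to reproduce $q \approx n|c_1|$ exactly.

```python
def microwave_shift_hz(rabi_hz, detuning_hz):
    """Dressing shift Omega^2 / (4 delta), all quantities as frequencies in Hz."""
    return rabi_hz ** 2 / (4.0 * detuning_hz)

def control_parameter_hz(q_b_hz, rabi_hz, detuning_hz):
    """q = q_B - q_MW."""
    return q_b_hz - microwave_shift_hz(rabi_hz, detuning_hz)

def detuning_for_target_hz(q_b_hz, rabi_hz, q_target_hz):
    """Detuning that yields a desired q at fixed Rabi frequency (requires q_target < q_B)."""
    return rabi_hz ** 2 / (4.0 * (q_b_hz - q_target_hz))

def mean_field_regime(q_hz, n_c1_abs_hz):
    """Mean-field ground state: polar for q > 2 n|c_1|, easy-plane for 0 < q < 2 n|c_1|."""
    if q_hz > 2.0 * n_c1_abs_hz:
        return "polar"
    if q_hz > 0.0:
        return "easy-plane"
    return "outside the range discussed in the text"

Q_B, RABI, N_C1 = 56.0, 5300.0, 2.0   # Hz, rounded values from the text

q_mw = microwave_shift_hz(RABI, 137e3)                 # shift at the quoted detuning
delta_target = detuning_for_target_hz(Q_B, RABI, N_C1)  # detuning giving q = n|c_1|
q_check = control_parameter_hz(Q_B, RABI, delta_target)

print(f"q_MW at 137 kHz detuning: {q_mw:.1f} Hz")
print(f"detuning for q = n|c_1|: {delta_target / 1e3:.1f} kHz")
print(f"regime at that detuning: {mean_field_regime(q_check, N_C1)}")
```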

Preparation of different initial conditions. We prepare three initial conditions 
(see Fig. 4) that differ from the polar state. For initial condition 1 the control 
parameter is first set to $q \approx n|c_1| + 1$ Hz. After 500 ms of spin dynamics at this 
value we quench to the final value $q \approx n|c_1|$. For the preparation of initial condition 
2 we apply a resonant $\pi/5$ radio-frequency pulse to populate the (1, ±1) states. 
After a hold time of 100 ms at a magnetic field gradient of around 0.2 μG μm⁻¹ in 
the longitudinal trap direction we apply a second $\pi/5$ radio-frequency pulse. The 
combination of $q$ and an inhomogeneous $p$ during the hold time leads to a spatially 
modulated transversal spin on a length scale of $\lambda \approx 80$ μm. For initial condition 3 
we populate homogeneously the (1, ±1) states with a short radio-frequency pulse 
such that $(n_{+1} + n_{-1})/n \approx 0.1$. 

Spin read-out. The spin dynamics is initiated by quenching the control parameter. 
After a fixed evolution time $t$ we apply a short magnetic field gradient pulse 
(Stern-Gerlach) in the z-direction and switch off the waveguide potential. Following a 
short time of flight (about 1 ms) we perform high-intensity absorption imaging 
with a resonant light pulse of duration 15 μs. The resolution of the imaging system 
is about 1.2 μm, corresponding to three pixels on the charge-coupled-device 
camera; we accordingly bin the spin profiles by three pixels. As our Stern-Gerlach 
analysis is oriented in the z-direction, for the read-out of the spin in the x-y plane 
we apply, before the magnetic field gradient, a radio-frequency pulse resonant with 
the transitions (1, 0) ↔ (1, ±1). 

The radio-frequency pulse can be modelled as a spin rotation described by the 
Hamiltonian $H_{\rm rf} = \Omega_{\rm rf} F_y$ with resonant Rabi frequency $\Omega_{\rm rf} \approx 2\pi \times 17.5$ kHz. 
Applying a $\pi/2$-pulse of duration 14.3 μs, the observable $F_x$ is mapped to the 
measurable density difference $n_{+1} - n_{-1}$. 
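
The read-out rotation can be checked with spin-1 matrices. The sketch below assumes that the rotation axis is $F_y$ (the axis label is not legible in every copy of the text, so this is an assumption), sets $\hbar = 1$, and uses hypothetical helper names. It confirms that a $\pi/2$ rotation maps $F_z$ onto $F_x$ up to a sign that depends on the rotation sense, and that a $\pi/2$ pulse at a Rabi frequency of $2\pi \times 17.5$ kHz lasts about 14.3 μs.

```python
import numpy as np

# Spin-1 matrices in the basis (m = +1, 0, -1), hbar = 1.
s = 1.0 / np.sqrt(2.0)
Fx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Fy = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Fz = np.diag([1.0, 0.0, -1.0]).astype(complex)

def rotation_about(F, angle):
    """exp(-i * angle * F) for a Hermitian generator F, via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(F)
    return vecs @ np.diag(np.exp(-1j * angle * vals)) @ vecs.conj().T

# Assumed rotation axis: F_y. In the Heisenberg picture the measured operator
# F_z becomes U^dagger F_z U after the pulse.
U = rotation_about(Fy, np.pi / 2)
Fz_after_pulse = U.conj().T @ Fz @ U
maps_to_minus_Fx = bool(np.allclose(Fz_after_pulse, -Fx))

# Duration of a pi/2 pulse: (pi/2) / Omega_rf with Omega_rf = 2*pi * 17.5 kHz.
omega_rf = 2 * np.pi * 17.5e3
pulse_duration_us = 1e6 * (np.pi / 2) / omega_rf

print("F_z is mapped to -F_x:", maps_to_minus_Fx)
print(f"pi/2 pulse duration: {pulse_duration_us:.1f} microseconds")
```

Reversing the sign of the rotation angle gives $+F_x$ instead; the density difference therefore measures the transverse spin component up to this convention-dependent sign.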

Inferring the spin orientation. The double-peaked spin distributions in the 
scaling regime (see Extended Data Fig. 1) resemble a distribution of a transversal 
spin with random orientation. To extract the corresponding ensemble average 
length $\langle |F_\perp| \rangle$ of the transversal spin and its fluctuation $\sigma$ we fit a probability 
density of the form $p(F_x) \propto 1/\sqrt{1 - (F_x/\langle |F_\perp| \rangle)^2}$ convolved with a Gaussian 
distribution with root-mean-square $\sigma$. Under the assumption of a homogeneous 
spin length the spatial profile of the angular orientation is given by 
$\theta(y) = \arcsin(F_x(y)/\langle |F_\perp| \rangle)$. If the maximal amplitude is larger than $\langle |F_\perp| \rangle - \sigma$ we 
use the maximal amplitude of the single realization instead of $\langle |F_\perp| \rangle$. 
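
The origin of the double-peaked distribution can be illustrated with a small simulation: projecting a spin of fixed transverse length and uniformly random orientation onto one axis gives the density $1/\sqrt{1 - (F_x/\langle|F_\perp|\rangle)^2}$, and Gaussian length fluctuations smear its edges. The Python sketch below uses assumed values for the spin length and $\sigma$, a fixed random seed, and a hypothetical helper `local_angle` that follows the rule stated above; it is not the fitting routine of the experiment.

```python
import numpy as np

rng = np.random.default_rng(1)

F_PERP = 1.0   # assumed mean transverse spin length (arbitrary units)
SIGMA = 0.05   # assumed r.m.s. fluctuation

# Random orientation in the x-y plane, projected on x, plus Gaussian smearing.
angles = rng.uniform(-np.pi, np.pi, 200_000)
F_x = F_PERP * np.sin(angles) + rng.normal(0.0, SIGMA, angles.size)

hist, edges = np.histogram(F_x, bins=41, range=(-1.5, 1.5), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
centre_density = float(hist[len(hist) // 2])        # bin centred on F_x = 0
peak_density = float(hist.max())
peak_position = float(abs(centres[np.argmax(hist)]))

def local_angle(F_x_profile, F_perp_mean, sigma):
    """theta(y) = arcsin(F_x(y) / scale). The scale is the ensemble mean length,
    unless the profile's maximal amplitude exceeds F_perp_mean - sigma, in which
    case that maximal amplitude is used instead."""
    amplitude = float(np.max(np.abs(F_x_profile)))
    scale = amplitude if amplitude > F_perp_mean - sigma else F_perp_mean
    # The clip only guards against rounding slightly outside [-1, 1].
    return np.arcsin(np.clip(F_x_profile / scale, -1.0, 1.0))

theta_example = local_angle(np.array([0.0, 0.5, 1.2]), F_PERP, SIGMA)
print(f"density at centre {centre_density:.2f}, at peak {peak_density:.2f}")
```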

Extraction of scaling exponents. After rescaling the results of the discrete Fourier 
transform according to equation (2) we interpolate with cubic splines to obtain a 
common k-grid for all evolution times. We vary the scaling exponents $\alpha$ and $\beta$ to 
minimize the sum of the squared relative differences of all structure factors $f_\theta$. To estimate 
the statistical error on the exponents we employ a jackknife resampling analysis. 
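
The rescaling procedure can be sketched on synthetic data. The Python example below generates structure factors that obey the scaling form (2) exactly, rescales them for trial exponents, interpolates onto a common k-grid and picks the exponents that minimize the summed squared relative deviation from the mean curve. Several simplifications are assumptions of this sketch rather than the procedure of the experiment: linear interpolation in log-log space replaces the cubic splines, a coarse grid search replaces a proper minimizer, deviations are taken relative to the mean rescaled curve, the jackknife error estimate is omitted, and all names and grid choices are hypothetical.

```python
import numpy as np

def scaling_function(k, k_s=1.0 / 133.0, zeta=2.6):
    """Parameterization f_s(k) = 1 / [1 + (k / k_s)^zeta], with k in 1/micrometre."""
    return 1.0 / (1.0 + (k / k_s) ** zeta)

def collapse_cost(k, times, spectra, alpha, beta, t_ref=4.5, n_grid=100):
    """Summed squared relative deviation of the rescaled curves from their mean."""
    tau = np.asarray(times, dtype=float) / t_ref
    k_resc = [tk ** beta * k for tk in tau]                         # k -> tau^beta * k
    f_resc = [tk ** (-alpha) * f for tk, f in zip(tau, spectra)]    # f -> tau^-alpha * f
    lo = max(kr[0] for kr in k_resc)                                # overlap of all k-ranges
    hi = min(kr[-1] for kr in k_resc)
    grid = np.geomspace(lo, hi, n_grid)
    curves = np.array([
        np.exp(np.interp(np.log(grid), np.log(kr), np.log(fr)))
        for kr, fr in zip(k_resc, f_resc)
    ])
    mean_curve = curves.mean(axis=0)
    return float(np.sum(((curves - mean_curve) / mean_curve) ** 2))

# Synthetic data obeying f(k, t) = tau^alpha * f_s(tau^beta * k), tau = t / t_ref.
k = np.geomspace(1e-3, 4e-2, 200)
times = np.arange(4.0, 10.0, 1.0)          # 4 s to 9 s
true_alpha, true_beta = 0.33, 0.54
spectra = [(t / 4.5) ** true_alpha * scaling_function((t / 4.5) ** true_beta * k)
           for t in times]

alphas = np.round(np.linspace(0.20, 0.50, 31), 2)
betas = np.round(np.linspace(0.40, 0.70, 31), 2)
costs = np.array([[collapse_cost(k, times, spectra, a, b) for b in betas]
                  for a in alphas])
i, j = np.unravel_index(np.argmin(costs), costs.shape)
alpha_fit, beta_fit = float(alphas[i]), float(betas[j])
print(f"best collapse at alpha = {alpha_fit:.2f}, beta = {beta_fit:.2f}")
```

For noisy data the minimum becomes shallow, particularly along the direction in which changes of the two exponents compensate each other in the power-law tail, which is one reason a resampling estimate of the uncertainty is useful.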


Data availability 
The data presented in this paper are available from the corresponding author upon 
reasonable request. 
Extended Data Fig. 1 | Spin distributions for all evolution times. 
a, The panels show the distributions of the transversal spin, $F_x$, measured 
at different evolution times as indicated. Initially, we find a narrow 
Gaussian distribution corresponding to the prepared coherent spin state. 
The excitations developing in the transversal spin lead to a double-peaked 
distribution within the interval of 2 s to 10 s. For long evolution times, 
t > 12 s, the distribution resembles a Gaussian, which is much broader 
than the initial distribution. b, The spin length and its root-mean-square 
fluctuation as a function of evolution time are extracted by a fit 
(see Methods). We find a slow decay of the spin length and nearly constant 
root-mean-square fluctuations in the scaling regime. 




[Figure: Fourier transform of transversal spin versus spatial momentum k (1/μm)] 

Extended Data Fig. 2 | Build-up of transversal spin in momentum 
space. Since the angular orientation $\theta$ cannot be extracted reliably for 
short evolution times, we choose to show the Fourier transform of the 
transversal spin for regimes 1-3 (see Fig. 1). The initial condition, all 
atoms prepared in $m_F = 0$, is characterized by a flat distribution. There 
is a fast build-up of long-wavelength spin excitations by more than two 
orders of magnitude within the first second. This process is followed by 
a redistribution of momenta leading to the scaling form for times longer 
than 4 s. 
Extended Data Fig. 3 | Scaling of structure factor for all experimentally accessible length scales. Same data as shown in Fig. 2. a, Unscaled data. 
b, Data rescaled with the scaling exponents reported in the main text. The rescaling does not apply for large momenta, k > 0.04 μm⁻¹. 
Universal prethermal dynamics of Bose gases quenched to unitarity 

Christoph Eigen, Jake A. P. Glidden, Raphael Lopes, Eric A. Cornell, Robert P. Smith & Zoran Hadzibabic 


Understanding strongly correlated phases of matter, such as the 
quark-gluon plasma and neutron stars, and in particular the 
dynamics of such systems, for example, following a Hamiltonian 
quench (a sudden change in some Hamiltonian parameter, such as 
the strength of interparticle interactions) is a fundamental challenge 
in modern physics. Ultracold atomic gases are excellent quantum 
simulators for these problems, owing to their tunable interparticle 
interactions and experimentally resolvable intrinsic timescales. 
In particular, they provide access to the unitary regime, in which 
the interactions are as strong as allowed by quantum mechanics. 
This regime has been extensively studied in Fermi gases. The
less-explored unitary Bose gases offer possibilities such as
universal physics controlled solely by the gas density and new
forms of superfluidity. Here, through momentum- and time-resolved
studies, we explore degenerate and thermal homogeneous
Bose gases quenched to unitarity. In degenerate samples, we observe
universal post-quench dynamics in agreement with the emergence
of a prethermal state with a universal non-zero condensed
fraction. In thermal gases, the dynamic and thermodynamic
properties generally depend on the gas density and the temperature, 
but we find that they can still be expressed in terms of universal 
dimensionless functions. Surprisingly, we find that the total quench- 
induced correlation energy is independent of the gas temperature. 
These measurements provide quantitative benchmarks and 
challenges for the theory of unitary Bose gases. 

In ultracold atomic gases, two-body contact interactions are characterized
by the s-wave scattering length a, and the unitary regime is
realized in the limit a → ∞, with a tuned using magnetic Feshbach
resonances. In Bose gases, tuning a to infinity also enhances three-body
recombination, which leads to particle loss and heating, making
unitary Bose gases inherently dynamic, non-equilibrium systems.
Experimentally, these systems are studied by rapidly quenching a to
infinity (Fig. 1a), which initiates the non-equilibrium dynamics. If
starting with a Bose-Einstein condensate (BEC) in the k = 0 momentum
state, after the quench the momentum distribution broadens (the
kinetic energy increases) owing to lossless correlation dynamics and 
to recombination heating (Fig. 1b). The interplay between these two 
processes raises many questions, such as whether the gas attains a 
strongly correlated quasi-equilibrium steady state before degeneracy 
is lost. 

The timescales of the different processes are set by the natural length
scales of the system. Within the universality hypothesis, in a homogeneous
degenerate unitary gas the only relevant length scale is the
interparticle spacing $n^{-1/3}$, where n is the particle density, which (in
analogy with Fermi gases) sets the Fermi momentum $\hbar k_n = \hbar(6\pi^2 n)^{1/3}$,
energy $E_n = \hbar^2 k_n^2/(2m)$ and time $t_n = \hbar/E_n$, where m is the particle mass
and ħ is the reduced Planck constant. Additional, potentially relevant
length scales are the sizes of the Efimov trimer states that exist as a
result of resonant two-body interactions. Three-body correlations
and Efimov trimers have been observed experimentally, but all
degenerate-gas dynamics have been consistent with $t_n$ being the only
characteristic timescale. This universality has so far made it impossible
to disentangle the lossless from the recombination-induced
dynamics. Experimental evidence has suggested that the lossless processes
are faster, sufficiently so that the gas attains a degenerate steady
state; however, almost nothing could be established about the nature
of this state. Here we isolate the effects of the lossless post-quench
dynamics through momentum- and time-resolved studies of degenerate
and thermal Bose gases.
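As an illustrative check of the definitions above (not part of the original analysis), the following Python sketch evaluates the Fermi momentum, energy and time for a given density. The function name `fermi_scales`, the rounded ³⁹K mass and the unit conversions are assumptions made for this example. For n = 5.1 μm⁻³ the result is consistent with the values quoted in the caption of Fig. 1.

```python
import math

HBAR = 1.054571817e-34          # reduced Planck constant (J s)
ATOMIC_MASS = 1.66053907e-27    # atomic mass unit (kg)
M_K39 = 38.9637 * ATOMIC_MASS   # approximate mass of a 39K atom (kg)

def fermi_scales(n_per_um3, mass=M_K39):
    """Return (k_n in 1/um, E_n in J, t_n in us) for a density given in um^-3."""
    n = n_per_um3 * 1e18                     # convert um^-3 to m^-3
    k_n = (6 * math.pi**2 * n) ** (1 / 3)    # Fermi momentum / hbar (1/m)
    e_n = (HBAR * k_n) ** 2 / (2 * mass)     # Fermi energy (J)
    t_n = HBAR / e_n                         # Fermi time (s)
    return k_n * 1e-6, e_n, t_n * 1e6

k_n, e_n, t_n = fermi_scales(5.1)
print(f"k_n = {k_n:.1f} 1/um, t_n = {t_n:.0f} us")
# prints: k_n = 6.7 1/um, t_n = 27 us
```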

We prepare a homogeneous ³⁹K Bose gas in an optical-box trap
with a volume of around 3 × 10⁴ μm³ and use a Feshbach resonance
centred at 402.70(3) G. Initially, we prepare either a quasi-pure BEC
or a thermal gas. In both cases, we start with a weakly interacting system,
with $na^3 < 10^{-4}$, then quench the gas to unitarity (within 2 μs)
and let it evolve for a time $t_{\rm hold}$; in our box trap, $t_n$ is a global variable
and after the quench all parts of the system evolve in the same way.
After the time $t_{\rm hold}$, we quench the gas back to low a, release it from
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Fig. 1 | Dynamics of a degenerate Bose gas quenched to unitarity. 

a, Quench protocol. The red circles depict atoms, and their sizes the
interaction strength, which is limited at unitarity by the interparticle
spacing; a is the s-wave scattering length and $t_{\rm hold}$ is the hold time at
unitarity. b, Momentum distribution $n_k(k)$ for different $t_{\rm hold}$ values; the
initial gas density is n = 5.1 μm⁻³, corresponding to a Fermi momentum of
$k_n$ = 6.7 μm⁻¹ and a Fermi time of $t_n$ = 27 μs. c, Populations of individual k
states show rapid initial growth, saturation at (quasi-)steady-state values of
$\bar{n}_k(k)$ (dashed lines) and long-time heating. The error bars reflect 1 s.e.m.
(not visible when smaller than the symbol size). The solid lines are sigmoid
fits used to extract the initial-growth half-way times τ(k).
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Fig. 2 | Universal post-quench dynamics and the steady-state 
momentum distribution in the degenerate Bose gas. a, b, The 
momentum-dependent half-way time τ(k) for reaching the steady state (a)
and the steady-state momentum distribution $\bar{n}_k(k)$ (b), for three different
BEC densities n. Expressing all quantities in dimensionless form, using the


the trap and measure its momentum distribution $n_k(k)$; we normalize
$n_k$ so that

$$\int 4\pi k^2 n_k \, {\rm d}k = 1.$$

See Methods for further experimental details.

We first present our study of degenerate gases. In Fig. 1b we show
$n_k(k)$ for an initial BEC density of n = 5.1 μm⁻³ and various $t_{\rm hold}$. In
Fig. 1c we illustrate our key experimental observation. By looking at
$n_k$ values for individual k states, we discern separate stages in the evolution
of $n_k$: after a rapid initial growth, $n_k$ reaches a (quasi-)steady-state
plateau, before the long-time heating takes over. All timescales
are of order $t_n$, but distinguishable. We discern such time separation
for $k/k_n \gtrsim 0.8$. For each k in this range, we identify the plateau occupation
$\bar{n}_k$ (dashed lines in Fig. 1c) and use sigmoid fits (solid lines) to
extract the characteristic time τ(k) for the initial rapid growth of $n_k$,
defined such that $n_k(k, \tau(k)) = \bar{n}_k(k)/2$. We note that $t_n$ and $k_n$ correspond
to the initial n; for our longest τ we observe particle loss of
approximately 20%.
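The extraction of a half-way time can be sketched as follows. The text does not specify the sigmoid used, so this example assumes a logistic curve with a plateau value supplied separately, and fits it through the logit linearisation; the function name `halfway_time` and the synthetic data are hypothetical.

```python
import math

def halfway_time(times, values, plateau):
    """Fit plateau / (1 + exp(-(t - tau) / w)) and return (tau, w).

    Uses the linearisation ln(n / (plateau - n)) = (t - tau) / w, so only
    points with 0 < n < plateau enter the least-squares line fit.
    """
    pts = [(t, math.log(v / (plateau - v)))
           for t, v in zip(times, values) if 0 < v < plateau]
    if len(pts) < 2:
        raise ValueError("need at least two points inside (0, plateau)")
    count = len(pts)
    mean_t = sum(t for t, _ in pts) / count
    mean_y = sum(y for _, y in pts) / count
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -intercept / slope, 1.0 / slope   # n(tau) = plateau / 2

# Synthetic, noise-free data (arbitrary time units) with tau = 40 and w = 8.
true_tau, true_w, plateau = 40.0, 8.0, 1.0
ts = [5.0 * i for i in range(1, 17)]
ns = [plateau / (1 + math.exp(-(t - true_tau) / true_w)) for t in ts]
tau, w = halfway_time(ts, ns, plateau)
```

With noisy data a nonlinear least-squares fit of the full curve would usually be preferable, since the logit transform amplifies noise near zero and near the plateau.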

In Fig. 1c we also see that the curves for different k values are not
aligned in time; $n_k(2k_n)$ shows signs of heating before $n_k(k_n)$ reaches its
steady-state value. This finding illustrates why lossless and recombination
dynamics could not be separated quantitatively by considering all
k at the same evolution time, such as by looking at the kinetic energy
per particle $E(t_{\rm hold})$. Instead, we separately obtain $\bar{n}_k$ for different k
values and piece together the function $\bar{n}_k(k)$. Doing so does not give the
momentum distribution at any specific time, but allows us to infer what
the steady-state $n_k(k)$ would be if the gas did not suffer from losses and
heating. We assume that at early times ($t_{\rm hold} = O(t_n)$), all non-zero-k
states are primarily fed from the macroscopically occupied BEC
(Fig. 1b).

In Fig. 2 we plot the dimensionless $\tau/t_n$ and $\bar{n}_k k_n^3$ versus the dimensionless
$k/k_n$, for three BEC densities. When all quantities are expressed in
dimensionless form, all of our data fall onto universal curves (within
experimental errors).

In the experimentally accessible range of momenta, our data are consistent
with the scaling $\tau/t_n \propto k_n/k$ at low k and $\tau/t_n \propto (k_n/k)^2$ at high k.
These scalings were predicted for the emergence of a prethermal steady
state. According to this prediction, at short times after the quench,
the excitations are similar to the Bogoliubov modes in a weakly interacting
BEC (which are phonons at low k and particles at high k), but
with the usual mean-field energy replaced by an energy of order $E_n$.
The speed of sound is then of order $\hbar k_n/m$ and the crossover between
the two regimes is at $k = O(k_n)$. Finally, τ(k) is set by the dephasing
time, which is given approximately by the inverse of the excitation
energy.

Fermi time $t_n$ and momentum $k_n$ as the natural scales, collapses all of our
data onto universal curves. The error bars show fitting errors (not visible
when smaller than the symbol size). The solid line in b is an exponential
fit, $\bar{n}_k k_n^3 = 1.53\exp(-3.62\,k/k_n)$.

The form of the universal $\bar{n}_k k_n^3$ curve was not anticipated and poses
a new theoretical challenge. Empirically, over three orders of magnitude
of $\bar{n}_k k_n^3$, our data are well captured by a simple exponential,
$A\exp(-Bk/k_n)$, with A = 1.53(5) and B = 3.62(2) (where the errors are
1σ fitting errors). This function implies a condensed fraction of

$$\eta = 1 - \int 4\pi k^2 \bar{n}_k \, {\rm d}k = 19(4)\%.$$

Up to $k \approx 3k_n$ we do not observe the asymptotic form $n_k \sim 1/k^4$ that is
expected at very high k; however, even if $\bar{n}_k$ changed to this more
slowly decaying form immediately outside of our experimental range,
η would change by less than 3%. Previous theoretical work has predicted
values of η in the prethermal state that are close to our estimate,
but the exponential form of $\bar{n}_k k_n^3$ has not previously been predicted.
Explaining this experimental observation may require explicit consideration
of the quench back to low a.
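The quoted condensed fraction can be reproduced from the exponential fit alone. The sketch below is an illustrative check rather than the authors' procedure: it assumes the exponential form holds for all k, integrates it numerically with a simple trapezoidal rule and compares the result with the closed form. The function name, integration cut-off and step count are choices made for this example.

```python
import math

A, B = 1.53, 3.62   # fit parameters quoted in the text

def noncondensed_fraction(a=A, b=B, x_max=20.0, steps=20000):
    """Trapezoidal estimate of int 4*pi*x^2 * a*exp(-b*x) dx, with x = k/k_n."""
    h = x_max / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        f = 4 * math.pi * x**2 * a * math.exp(-b * x)
        total += f * (0.5 if i in (0, steps) else 1.0)
    return total * h

numeric = noncondensed_fraction()
analytic = 8 * math.pi * A / B**3   # uses int_0^inf x^2 exp(-B x) dx = 2 / B^3
eta = 1 - numeric
print(f"condensed fraction = {eta:.2f}")
# prints: condensed fraction = 0.19
```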

We now turn to thermal gases, which reveal some simplifications, but
also more surprises. A simplification is that, while in a thermal gas the
three-body recombination and the lossless dynamics are both slowed
down compared to the degenerate-gas case, the three-body recombination
is slowed down more. As shown in Fig. 3a, now $E(t_{\rm hold})$
exhibits two separate stages in the post-quench dynamics: a rapid
initial growth (here for $t_{\rm hold} \lesssim 100$ μs) and long-time heating
(for $t_{\rm hold} \gg 100$ μs). The shape of the curve is similar to those for individual k
states in Fig. 1c and the long-time energy growth matches the theory of
recombination heating. These results reinforce our interpretation of
the two-step dynamics, both for degenerate and for thermal gases. We
now focus on the early-time dynamics. As we show in Fig. 3b, $n_k(k)$ is
essentially identical at 60 μs and 126 μs, meaning that on this timescale
a steady state is established for all k.

In a thermal gas, even before the quench to unitarity, $n_k$ is substantial
for all $k \lesssim 1/\lambda$, where $\lambda = h/\sqrt{2\pi m k_{\rm B} T}$ is the thermal wavelength, T is
the initial temperature (before the quench to unitarity), $k_{\rm B}$ is the
Boltzmann constant and $h = 2\pi\hbar$. We therefore look at the redistribution
of particles in k space, in particular, the change $\delta n_k(k)$ with respect to
$t_{\rm hold} = 0$ and the corresponding change δε in the spectral energy density
$\varepsilon = \hbar^2/(2m) \times 4\pi k^4 n_k$. An additional challenge in understanding the
thermal-gas case is that we have two relevant length scales, $n^{-1/3}$ and λ,
and it is not a priori clear whether the dynamic and thermodynamic
properties can be expressed in terms of dimensionless universal
functions.
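To give a sense of the length scales involved, the following sketch evaluates the thermal wavelength for ³⁹K and the corresponding phase-space density. The helper names and the rounded atomic mass are assumptions for this example; the factor 4.4/λ used for the special momentum is the value reported later in the text for $k_0$.

```python
import math

H = 6.62607015e-34                   # Planck constant (J s)
K_B = 1.380649e-23                   # Boltzmann constant (J/K)
M_K39 = 38.9637 * 1.66053907e-27     # approximate mass of a 39K atom (kg)

def thermal_wavelength_um(temp_nK, mass=M_K39):
    """Thermal wavelength h / sqrt(2*pi*m*k_B*T) in micrometres."""
    temp = temp_nK * 1e-9
    return H / math.sqrt(2 * math.pi * mass * K_B * temp) * 1e6

def phase_space_density(n_per_um3, temp_nK):
    """Dimensionless n * lambda^3 for a density given in um^-3."""
    return n_per_um3 * thermal_wavelength_um(temp_nK) ** 3

lam = thermal_wavelength_um(150.0)   # roughly 0.72 um at T = 150 nK
k0 = 4.4 / lam                       # close to the 6.0 1/um quoted for Fig. 3c
psd = phase_space_density(5.6, 150.0)
```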

In Fig. 3c we show time-resolved population changes in different
spherical shells in k space, $4\pi k^2 \delta n_k$. For some special $k_0$ (dotted line),
the population remains essentially constant. In Fig. 3d we show vertical
cuts through Fig. 3c for $k < k_0$, $k = k_0$ and $k > k_0$. Away from $k_0$, we use
sigmoid fits (solid lines) to extract τ(k), both for diminishing and for
growing populations. Near $k_0$ we see only a small wiggle in $\delta n_k$, to which
we cannot assign a single timescale.

In Fig. 3e, f we show τ(k) and the steady state δε(k) for two
different combinations of n and T. The δε(k) curve conveys the
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Fig. 3 | Thermal Bose gas quenched to unitarity. a, The kinetic energy 
per particle E shows rapid growth for $t_{\rm hold} \lesssim 100$ μs and substantial heating
only for $t_{\rm hold} \gg 100$ μs; the black line is the prediction for recombination
heating. Here, and in b-d, the initial gas density and temperature are
n = 5.6 μm⁻³ and T = 150 nK, respectively. b, Momentum distribution
$n_k(k)$ for different hold times at unitarity. The initial redistribution of
particles from low k to high k (indicated by the dotted arrow) is essentially
complete within 60 μs, and $n_k$ is almost identical at 126 μs and 60 μs.
c, Population changes in different k-space shells, $4\pi k^2 \delta n_k(k)$; the


redistribution of particles from $k < k_0$ to $k > k_0$ and the resulting
energy growth

$$\Delta E = \int \delta\varepsilon \, {\rm d}k.$$

The dispersive shape of τ(k) was not anticipated and invites further
theoretical work. Here, we empirically investigate whether these curves
can be scaled into universal dimensionless functions.


[Fig. 4 key: the 15 combinations of initial gas density n and temperature T. Axis labels recovered from the figure: $t_{\rm hold}$ (μs) and kλ.]

n (μm⁻³)    T (nK)
0.7         200
1.1         160
1.2         100
1.3         70
1.8         80
2.0         190
2.3         100
2.7         170
3.2         100
3.8         190
4.2         150
5.0         190
5.1         190
5.6         150
6.7         200


population in $k_0$ = 6.0 μm⁻¹ (dotted line) remains essentially unchanged.
d, Vertical cuts through the plot in c. Solid lines are sigmoid fits used to
extract the half-way time τ(k). e, f, τ(k) and the change in the spectral
energy density (between the initial, pre-quench state and the post-quench
steady state), $\delta\varepsilon(k) \propto k^4 \delta n_k(k)$. Here we show data for n = 5.6 μm⁻³ and
T = 150 nK (blue) and for n = 1.3 μm⁻³ and T = 70 nK (red). For the data
in a and d, 1 s.e.m. error bars are smaller than the symbol size. In e and f,
the error bars (in most cases smaller than the symbol size) show fitting
errors.

Fig. 4 | Universal dynamic and thermodynamic functions for the
thermal Bose gas quenched to unitarity. a, d, Plotting the half-way time τ
for reaching the post-quench steady state (a) and the change in the spectral
energy density δε/λ (where λ is the thermal wavelength; d) versus kλ
horizontally aligns all of our curves for 15 different combinations of the
initial gas density n and temperature T (see key). The vertical grey line
corresponds to $k_0$. b, Supposing that the characteristic timescale for the
dynamics is $t_* \propto t_n^{\alpha_t} t_\lambda^{\beta_t}$, where $t_\lambda = \hbar/(k_{\rm B}T)$, we obtain the best data collapse,
corresponding to the minimum of $\sigma/\sigma_0$ (see text for details), for
$\alpha_t \approx \beta_t \approx 1/2$ (dashed cross indicates $\alpha_t = \beta_t = 1/2$). This suggests that
$t_* = \sqrt{t_n t_\lambda}$. e, Similarly, for the energy scale $E_* \propto E_n^{\alpha_E}(k_{\rm B}T)^{\beta_E}$, we find $\alpha_E \approx 1$
and $\beta_E \approx 0$, which suggests that $E_* = E_n$. c, f, The dimensionless $\tau/\sqrt{t_n t_\lambda}$ (c)
and $\delta\varepsilon/(\lambda E_n)$ (f) are, to within experimental errors, universal functions of
the dimensionless kλ. All error bars (not visible when smaller than the
symbol size) show fitting errors.

For the horizontal scaling we find that the natural scale for k is 1/λ,
independent of n. In Fig. 4a we plot τ(k) versus kλ, for 15 combinations
of n and T (corresponding to phase-space densities nλ³ between 0.2
and 2). Similarly, in Fig. 4d we plot δε(k)/λ versus kλ, so that the area
under each curve is still ΔE(n, T). In both cases, we see horizontal
alignment of all of the curves, with $k_0 = 4.4/\lambda$.

A more challenging question is whether these n- and T-dependent
curves may be collapsed vertically, by scaling them by some time
$t_*(n, T)$ and energy $E_*(n, T)$. We conjecture that $t_* \propto t_n^{\alpha_t} t_\lambda^{\beta_t}$, where
$t_\lambda = \hbar/(k_{\rm B}T)$, and similarly $E_* \propto E_n^{\alpha_E}(k_{\rm B}T)^{\beta_E}$, and determine for which $\alpha_{t,E}$ and
$\beta_{t,E}$ we get the best collapse. We treat $\alpha_{t,E}$ and $\beta_{t,E}$ as independent, but
physically (if there are no other relevant scales) we expect
$\alpha_t + \beta_t = \alpha_E + \beta_E = 1$.

We quantify the degree of the data collapse by a single number σ,
which is obtained by calculating the standard deviation of the data for
all n and T at a fixed kλ and then summing over kλ. In Fig. 4b, e, we
show plots of $\sigma/\sigma_0$ for τ and δε/λ; here, $\sigma_0$ corresponds to no scaling.
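A collapse measure of this kind can be sketched on synthetic data. The example below is an illustration under stated assumptions, not the authors' analysis: the data sets, timescales and curve shape are made up, and the spread at each grid point is divided by the mean (a relative standard deviation) so that the measure does not depend on the units of the scaling factor, a normalisation the text does not specify.

```python
import math
from statistics import mean, pstdev

def collapse_sigma(curves, scales):
    """Sum over grid points of the relative spread between scaled curves.

    curves: one list per data set, sampled on a common k*lambda grid.
    scales: one scaling factor per data set.
    """
    scaled = [[y / s for y in curve] for curve, s in zip(curves, scales)]
    return sum(pstdev(col) / abs(mean(col)) for col in zip(*scaled))

# Hypothetical (t_n, t_lambda) pairs in arbitrary time units.
timescales = [(1.0, 4.0), (2.0, 3.0), (5.0, 1.5), (0.5, 6.0)]
grid = [0.5 * i for i in range(1, 13)]                       # k*lambda values
shape = [0.2 + 1.0 / (1.0 + (x - 4.4) ** 2) for x in grid]   # made-up universal curve
curves = [[math.sqrt(tn * tl) * g for g in shape] for tn, tl in timescales]

sigma_0 = collapse_sigma(curves, [1.0] * len(curves))        # no scaling
exponents = [i / 10 for i in range(11)]
results = []
for a in exponents:
    for b in exponents:
        scales = [tn**a * tl**b for tn, tl in timescales]
        results.append((collapse_sigma(curves, scales) / sigma_0, a, b))
best_ratio, best_a, best_b = min(results)   # smallest sigma / sigma_0
```

Because the synthetic curves were generated with a timescale proportional to the geometric mean of the two times, the scan recovers exponents of 1/2 for both.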

For the temporal scaling, in Fig. 4b we find the lowest σ near
$\alpha_t = \beta_t = 1/2$, which suggests that $t_* = \sqrt{t_n t_\lambda}$. In Fig. 4c we plot $\tau/\sqrt{t_n t_\lambda}$
and see that all of our data collapse onto a universal curve (within
experimental scatter). For this scaling we have an intuitive interpretation.
In a thermal gas, particles do not overlap, so to feel the unitary
interactions after the quench they must first meet. The $t_*$ that we find,
$\sqrt{t_n t_\lambda} \propto n^{-1/3}\lambda m/\hbar$, matches the expected scaling for the characteristic
time until meeting, which is given by the ratio of the interparticle spacing
$n^{-1/3}$ and the characteristic thermal velocity $\hbar/(m\lambda)$.
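The proportionality between the geometric-mean timescale and the meeting time follows directly from the definitions of the Fermi time, the thermal time ħ/(k_B T) and the thermal wavelength, and can be checked numerically. In the sketch below the function names and the sample densities and temperatures are chosen for illustration; the constant ratio it returns is $\pi^{-1/2}(6\pi^2)^{-1/3}$, independent of n and T.

```python
import math

HBAR = 1.054571817e-34               # reduced Planck constant (J s)
K_B = 1.380649e-23                   # Boltzmann constant (J/K)
M_K39 = 38.9637 * 1.66053907e-27     # approximate mass of a 39K atom (kg)

def geometric_mean_time(n, temp, mass=M_K39):
    """sqrt(t_n * t_lambda), with t_n = hbar / E_n and t_lambda = hbar / (k_B T)."""
    k_n = (6 * math.pi**2 * n) ** (1 / 3)
    t_n = 2 * mass / (HBAR * k_n**2)
    t_lam = HBAR / (K_B * temp)
    return math.sqrt(t_n * t_lam)

def meeting_time(n, temp, mass=M_K39):
    """Interparticle spacing n^(-1/3) divided by the thermal velocity hbar / (m * lambda)."""
    lam = 2 * math.pi * HBAR / math.sqrt(2 * math.pi * mass * K_B * temp)
    return n ** (-1 / 3) * mass * lam / HBAR

# Densities in m^-3 and temperatures in K, spanning the range used in the text.
samples = [(0.7e18, 200e-9), (5.6e18, 150e-9), (1.3e18, 70e-9)]
ratios = [geometric_mean_time(n, T) / meeting_time(n, T) for n, T in samples]
expected = 1 / (math.sqrt(math.pi) * (6 * math.pi**2) ** (1 / 3))
```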

In Fig. 4e we find that the optimal values of $\alpha_E$ and $\beta_E$ are $\alpha_E \approx 1$ and
$\beta_E \approx 0$, suggesting that $E_* = E_n$. This scaling implies that, surprisingly,
whereas δε(k) naturally depends on n and T, its integral ΔE is independent
of T; in Fig. 4f we see that this scaling collapses all of our data
onto a universal curve.

This lack of T dependence suggests that $\Delta E/E_n$ in a thermal gas
should also be equal to $E/E_n$ in a degenerate gas (where ΔE = E).
Bearing in mind the caveat that we do not observe very high-k tails
experimentally, from the data in Fig. 4f we estimate that $\Delta E/E_n = 0.7(1)$
for a thermal gas; from the exponential $\bar{n}_k k_n^3$ in Fig. 2b, we obtain a
consistent value of $E/E_n = 0.74(4)$ for a degenerate gas.
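The degenerate-gas value can be recovered from the exponential fit in closed form. This sketch assumes that the kinetic energy per particle is the integral of $4\pi k^2 \bar{n}_k \, \hbar^2 k^2/(2m)$ over all k, with the exponential form extended beyond the measured range; the function name is hypothetical.

```python
import math

A, B = 1.53, 3.62   # exponential-fit parameters quoted for Fig. 2b

def kinetic_energy_ratio(a=A, b=B):
    """E/E_n = int_0^inf 4*pi*x^4 * a*exp(-b*x) dx, with x = k/k_n.

    Uses the standard integral int_0^inf x^4 exp(-b*x) dx = 4! / b^5.
    """
    return 4 * math.pi * a * math.factorial(4) / b**5

ratio = kinetic_energy_ratio()
print(f"E/E_n = {ratio:.2f}")
# prints: E/E_n = 0.74
```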

Our experiments establish a comprehensive view of the prethermal 
dynamics and thermodynamics of homogeneous Bose gases quenched 
to unitarity, at low and high temperatures. They provide quantitative 
benchmarks and new questions for the theory of unitary Bose gases. 
Open problems include explaining the forms of our experimentally
observed universal dynamic and thermodynamic functions, and
elucidating the connections between these universal features and
previously observed signatures of non-universal Efimov physics.
Experimentally, an important future challenge is to probe the coherence
and the potential superfluid properties of the prethermal state of
a degenerate unitary Bose gas.

While this paper was under review, we learned of two other experiments
that observe universality in the many-body dynamics of
out-of-equilibrium quantum systems.


Online content 

Any methods, additional references, Nature Research reporting summaries, source 
data, statements of data availability and associated accession codes are available at 
https://doi.org/10.1038/s41586-018-0674-1. 
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METHODS 


Optical-box trap and sample preparation. As described previously (refs 36, 37), our box trap
is formed by blue-detuned, 532-nm laser beams. It is cylindrical in shape, with a diameter
of about 30 μm and a length of about 45 μm. We deduce n from the measured atom
number, and take into account the fact that the trap walls are not infinitely steep,
owing to the diffraction limit on the sharpness of the laser beams, so the effective trap
volume depends slightly on the energy per particle in the initially prepared sample.
Our clouds are in the lowest hyperfine ground state and we initially prepare
them at a field of approximately 399.1 G. At this field, the scattering length is
$a_i \approx 400\,a_0$, where $a_0$ is the Bohr radius.
Quench protocol and measurement details. At the end of $t_{\rm hold}$ we quench a back
to $a_i$ using an exponential field ramp with a time constant of 1 μs. We use the fastest
ramp that is technically possible to minimize the conversion of atoms into molecules.
We then release the gas from the trap and simultaneously (within about
3 ms) completely turn off interactions (a → 0). After letting the cloud expand for
6–12 ms of time of flight, we take an absorption image of it. We typically repeat
each measurement about 20 times. To reconstruct $n_k(k)$ from the two-dimensional
absorption images, which give the momentum distribution integrated along the
line of sight, we average each image azimuthally, then average over the experimental
repetitions, and finally perform the inverse Abel transform. Owing to the
initial cloud size and non-infinite time of flight, our measurements of $n_k(k)$ are not
quantitatively reliable for k < 2 μm⁻¹.
Extrapolation of $\bar{n}_k k_n^3$ in a degenerate gas. We also use our experimental data to
estimate how the function $\bar{n}_k k_n^3$ extrapolates to lower $k/k_n$, without presuming its
functional form. For $k/k_n < 0.8$, we do not see clear steady-state plateaux in
$n_k(t_{\rm hold})$, such as indicated by the dashed lines in Fig. 1c. However, we can extrapolate
$\tau \propto t_n k_n/k$ according to the dashed line in Fig. 2a; then, assuming that heating
effects are not yet substantial at $t_{\rm hold} = \tau(k)$ and following our definition of τ,
we estimate $\bar{n}_k = 2n_k(\tau)$, where $n_k(\tau)$ is the $n_k$ measured at the extrapolated τ.
These extrapolated values of $\bar{n}_k k_n^3$ are shown by open symbols in Extended Data
Fig. 1. They fall on the same exponential curve that fits our directly measured
values of $\bar{n}_k k_n^3$ (solid symbols), lending further support for this unexpected functional
form.


Data availability 

The data that support the findings of this study are available in the Apollo reposi- 
tory (https://doi.org/10.17863/CAM.30242). Any additional information is avail- 
able from the corresponding authors on reasonable request. 


36. Gaunt, A. L., Schmidutz, T. F., Gotlibovych, I., Smith, R. P. & Hadzibabic, Z.
Bose-Einstein condensation of atoms in a uniform potential. Phys. Rev. Lett. 
110, 200406 (2013). 

37. Eigen, C. et al. Observation of weak collapse in a Bose-Einstein condensate. 
Phys. Rev. X 6, 041058 (2016). 




[Figure: $\bar{n}_k k_n^3$ (logarithmic scale) versus $k/k_n$.]


Extended Data Fig. 1 | Extrapolation of $\bar{n}_k k_n^3$ in a degenerate gas to
lower $k/k_n$. Solid symbols show directly measured values (also shown in
Fig. 2b), here combining the data for all three BEC densities. Open 
symbols show experimentally extrapolated values, for all three densities, as 
described in Methods. The solid line is the same as in Fig. 2b. 
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Universal dynamics in an isolated one-dimensional 
Bose gas far from equilibrium 


Sebastian Erne, Robert Bücker, Thomas Gasenzer, Jürgen Berges & Jörg Schmiedmayer


Understanding the behaviour of isolated quantum systems far from 
equilibrium and their equilibration is one of the most pressing 
problems in quantum many-body physics. There is strong
theoretical evidence that sufficiently far from equilibrium a wide
variety of systems (including the early Universe after inflation,
quark-gluon matter generated in heavy-ion collisions, and
cold quantum gases) exhibit universal scaling in time and
space during their evolution, independent of their initial state or 
microscale properties. However, direct experimental evidence is 
lacking. Here we demonstrate universal scaling in the time-evolving 
momentum distribution of an isolated, far-from-equilibrium, one- 
dimensional Bose gas, which emerges from a three-dimensional 
ultracold Bose gas by means of a strong cooling quench. Within the 
scaling regime, the time evolution of the system at low momenta is 
described by a time-independent, universal function and a single 
scaling exponent. The non-equilibrium scaling describes the 
transport of an emergent conserved quantity towards low momenta, 
which eventually leads to the build-up of a quasi-condensate. Our 
results establish universal scaling dynamics in an isolated quantum 
many-body system, which is a crucial step towards characterizing 
time evolution far from equilibrium in terms of universality classes. 
Universality would open the possibility of using, for example, cold- 
atom set-ups at the lowest energies to simulate important aspects 
of the dynamics of currently inaccessible systems at the highest 
energies, such as those encountered in the inflationary early 
Universe. 

Relaxation and thermalization generally result in loss of information 
about the details of the initial state of the system. However, the unitary 
quantum evolution of isolated systems preempts any such loss on a 
fundamental level. One way to resolve this contradiction reasons that 
the complexity of the many-body states involved and their dynamics 
lead to an insensitivity to the initial state for any realistic observable.
Consequently, at late times, the system can be characterized by only a 
few conserved quantities. 

Another path to loss of details about the underlying, microscale 
physics is through universality, such as critical scaling of correlations 
near phase transitions. Aspects of universality in non-equilibrium
systems have been discussed in many contexts, such as turbulence,
driven dissipative systems, defect formation when crossing a phase
transition, and the phenomena of coarsening and ageing.
Little is known about whether and how unitary time evolution from a 
general far-from-equilibrium state is connected to universality. 

It has recently been proposed that isolated systems far from equilibrium
can exhibit universal scaling in time and space associated
with non-thermal fixed points. There is growing theoretical
evidence for non-thermal universality classes, even away from any
phase transition, that encompass relativistic and non-relativistic
systems. In contrast to equilibrium critical phenomena, these
non-equilibrium attractor solutions do not require any fine tuning
of parameters. Moreover, the non-thermal scaling solutions do not
describe a time-translation-invariant state, whereas, for example, the
scaling around a thermalized or pre-thermalized state does.

Here we study the dynamics of a repulsively interacting Bose gas after
a strong cooling quench and identify a time window during which the
system exhibits universal behaviour far from equilibrium. We start our
experiment with a thermal gas of ultracold ⁸⁷Rb atoms in an extremely
elongated, quasi-one-dimensional (in the z direction) harmonic trap
(transverse confinement $\omega_\perp$ = 2 × 10⁴ s⁻¹, longitudinal confinement
$\omega_\parallel$ = 30 s⁻¹) just above the critical temperature. In the final cooling step,
the trap depth is lowered rapidly compared to the longitudinal thermalization
timescale (Fig. 1a). This leads to fast removal of high-energy
atoms, predominantly in the radially excited states, and hence constitutes
an almost instantaneous cooling quench of the system. At the end
of the cooling ramp, the trap depth lies below the first radially excited
energy level and only longitudinal excitations remain. After a short
holding period of 1 ms, which allows the atoms with large transverse
energies to leave, we rapidly increase the trap depth. In this way, we prepare
an isolated, far-from-equilibrium, one-dimensional system. The
gas is then left to evolve in the deep potential for variable times t up to
about 1 s, during which time the universal scaling dynamics takes place.

We probe the evolution of the system through two sets of measurements
(see Methods for details). First, the in situ density ρ(z, t) is
measured using standard absorption imaging after a short time of
flight of $t_{\rm tof}$ = 1.5 ms, during which the expansion is predominantly
along the tightly confined radial direction. Second, the momentum
distribution n(k, t) of the trapped gas is measured after a long time of
flight of $t_{\rm tof}$ = 46 ms using single-atom-resolved fluorescent imaging in
a thin light sheet. For each hold time t, the distributions are averaged
over many independent measurements (Methods).

A typical time evolution of each of these profiles is shown in
Fig. 1b. The far-from-equilibrium state at early times exhibits strongly
broadened density and momentum distributions. At early times, the
momentum distribution n(k) follows a characteristic exponential decay,
$n(k) \propto \exp(-k\xi_s)$, for large k. At late times, the system relaxes to thermal
equilibrium and is well described by a thermal quasi-condensate
(Fig. 1c, Extended Data Fig. 1; see Methods for details). The momentum
distribution is then described by only a Lorentzian function, with
width given by the thermal coherence length $\lambda_T = 2\hbar^2\rho(z)/(m k_{\rm B} T)$,
where ħ is the reduced Planck constant, m is the mass of the atoms,
$k_{\rm B}$ is the Boltzmann constant and T is the temperature. During the
evolution, a clear peak emerges at low momenta, signalling the quasi-condensation
of the system in momentum space. In the following,
we analyse the thermalization process, providing a link between the
far-from-equilibrium state at early times and the final equilibrium state
that is observed.

For the initial state of the far-from-equilibrium evolution, we find
n(k) in good agreement with a theoretical model of randomly distributed
solitonic defects (RDM; Fig. 1c). At low momenta, the RDM
has a Lorentzian shape, $n(k) \propto [1 + (k/n_s)^2]^{-1}$, with width defined by
the defect density $n_s$. At high momenta, n(k) exhibits characteristic
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Fig. 1 | Cooling quench and late-time evolution. a, Schematic of the 
experimental cooling quench. During the quench (blue-shaded region), 
the trap depth V in the radial direction r (upper panel) is ramped linearly 
to its final value within a time of approximately 7 ms by applying radio-frequency
radiation at a time-dependent frequency (radio-frequency knife; lower 
panel). The final value of V is below the first radially excited state 
(indicated by the dashed lines), which allows atoms (red dots) at higher 
energies to leave the trap (red arrows). The trap depth is held at its final 
position for approximately 0.5 ms to allow the hot atoms to leave and then 
raised within 1 ms to close the trap (see Methods). The resulting isolated, 
far-from-equilibrium, one-dimensional Bose gas is then measured after 


exponential decay, $n(k) \propto \exp(-k\xi_s)$, determined by the width $\xi_s$ of the
localized density suppression associated with a solitonic defect.
Because we probe the system immediately after the almost instantaneous
quench, these defects are not equilibrated (Extended Data Fig. 1);
they have a reduced defect width of $\xi_s$ = 0.07 μm ≈ $\xi_h$/3 and a very high
density of $n_s$ = 1.4 μm⁻¹. The peak healing length $\xi_h = \hbar/\sqrt{2 m g_{1D} n_0}$
sets the equilibrium width of a solitonic defect, where $g_{1D} = 2\hbar a_s \omega_\perp$ is
the one-dimensional interaction constant, $a_s$ is the s-wave scattering
length of ⁸⁷Rb and $n_0$ is the peak density. Although the nucleation of


a variable time t. b, Time evolution of the density ρ(z, t) (upper panel;
colour scale) and of the single-particle momentum distribution n(k, t) 
(lower panel). Each distribution is normalized to the time-dependent atom 
number N(t). c, Initial (upper panel) and final (lower panel) momentum 
distributions n(k). The data for high momenta are binned over seven 
adjacent k values to lower the noise level. Error bars mark the standard 
error of the mean. The solid blue and red lines are theoretical fits using 

the random-soliton model and a thermal quasi-condensate, respectively 
(Extended Data Fig. 1). The vertical dashed lines correspond to the 
momenta of the first radially excited state. 


solitons is predicted by the Kibble-Zurek mechanism, the almost
instantaneous quench creates an initial state with a strong overpopulation
of high-energy modes. This very far-from-equilibrium state sets
the initial conditions for the subsequent thermalization process and
facilitates the observation of the emerging universal dynamics during
the relaxation of the system.

The time evolution of the normalized momentum distribution
n(k, t)/N(t), where N(t) is the total atom number at time t, is shown
in Fig. 2a for the first 75 ms following the quench. The distribution
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Fig. 2 | Universal scaling dynamics. a, Time evolution of the momentum 
distribution. For better visibility, the data are binned over three adjacent 
points in momentum space, with the time encoded by the colour scale. 
The grey line indicates the reference distribution at tj = 4.7 ms; its width 
depicts the 95% confidence interval at to. The vertical dashed line marks 
the high-momentum cut-off for the scaling region. The arrows indicate the 
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scaling of the distribution in time according to equation (1). b, Momentum 
distribution rescaled according to equation (1). When depicted as a 
rescaled function (t/t)) “n(k, t) of the rescaled variables k = (t/ t))°k, the 
data for all times collapse to a single curve, representing the distribution at 
the reference time fy. The exponents a = 0.09 £0.05 and 3=0.10+0.04 
are determined from the maximum-likelihood function. 
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Fig. 3 | Scaling exponents. a, The combined two-dimensional likelihood 
function (colour scale), averaged over all times t and reference times 

to within the scaling period and over three different initial conditions, 
reveals a clear peak that yields the non-vanishing scaling exponents 

ax G=0.1+0.03, with a deviation between the two exponents of 
Aqs= a — G=—0.01 £0.02. The error is estimated using a Gaussian fit 
(black dashed lines) to the marginal-likelihood function (top and right). 
b, Dependence of the scaling exponents on the reference time fp. The 
exponents are, to a good approximation, independent of fp and agree 
well with the mean predictions (black solid and dashed lines). The error 
bars denote the standard deviation obtained from a Gaussian fit to the 
marginal-likelihood functions at each reference time separately. 


function shifts with time towards lower momentum scales while the 
occupancy grows in the infrared. In general, n(k, t) depends on k and 
t separately. 

However, it has been suggested that overpopulated fields far from
equilibrium can give rise to universal behaviour, signalled by the
infrared scaling property of the distribution function

$$n(k,t) = (t/t_0)^{\alpha}\, f_S\big((t/t_0)^{\beta} k\big) \qquad (1)$$

where $t_0$ denotes an arbitrary reference time within the period when
n(k, t) exhibits the scaling behaviour.

Figure 2b demonstrates that scaling exponents $\alpha$ and $\beta$ can indeed
be found such that, in the infrared, the rescaled distributions
$(t/t_0)^{-\alpha} n(k, t)$, as functions of the rescaled momenta
$\tilde{k} = (t/t_0)^{\beta} k$, collapse to a single curve
$f_S(\tilde{k}) = n(\tilde{k}, t_0)$. This indicates that below a characteristic
momentum scale $k_S$, the distribution function n(k, t) depends
on space and time only through the scaling of a single universal function
$f_S(\tilde{k})$. The scaling exponents are found to be $\alpha = 0.09 \pm 0.05$ and
$\beta = 0.1 \pm 0.04$, which indicates that $\alpha \approx \beta$ (see Methods for
details on the error estimation).
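
The data collapse described above can be illustrated with a minimal sketch. It assumes synthetic data generated from equation (1) with a hypothetical scaling function of the form 1/(1 + k^ζ); the function names, the momentum grid and the chosen times are illustrative assumptions and are not taken from the experiment.

```python
def scaling_function(k_tilde, zeta=2.39):
    """Illustrative universal function f_S (assumed form, dimensionless k)."""
    return 1.0 / (1.0 + abs(k_tilde) ** zeta)


def distribution(k, t, t0, alpha, beta):
    """Equation (1): n(k, t) = (t/t0)**alpha * f_S((t/t0)**beta * k)."""
    s = t / t0
    return s ** alpha * scaling_function(s ** beta * k)


def rescale(k_values, n_values, t, t0, alpha, beta):
    """Return rescaled momenta (t/t0)**beta * k and values (t/t0)**(-alpha) * n."""
    s = t / t0
    return [s ** beta * k for k in k_values], [s ** (-alpha) * n for n in n_values]


def collapse_error(trial_alpha, trial_beta, true_alpha=0.1, true_beta=0.1, t0=4.7):
    """Largest deviation of the rescaled synthetic data from f_S."""
    k_grid = [0.1 * i for i in range(1, 50)]
    worst = 0.0
    for t in (10.0, 30.0, 70.0):
        n_t = [distribution(k, t, t0, true_alpha, true_beta) for k in k_grid]
        k_tilde, f = rescale(k_grid, n_t, t, t0, trial_alpha, trial_beta)
        for kt, ft in zip(k_tilde, f):
            worst = max(worst, abs(ft - scaling_function(kt)))
    return worst


error_at_true_exponents = collapse_error(0.1, 0.1)
error_at_wrong_exponents = collapse_error(0.3, 0.1)
```

With the exponents used to generate the data, all curves fall on the same function up to rounding; a mismatched exponent leaves a visible residual, which is the signal that a fit of the exponents would minimise.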

We demonstrate the predicted insensitivity of the universal prop- 
erties to the initial state by comparing the evolution for different ini- 
tial conditions before and after the cooling quench. We find excellent 
agreement for the scaling exponents, obtained independently by using 
a scaling analysis for each of the three measurements (Extended Data 
Figs. 2-5). This shows the generality and robustness of these non- 
equilibrium attractor solutions: in contrast to equilibrium critical 
phenomena, for which the temperature has to be adjusted to observe 
scaling, no fine-tuning of parameters is required. 

The universal character allows us to relate the predictions for each
measurement directly, resulting in the combined likelihood function
presented in Fig. 3a. We consider, for the analysis, the approximately
uncorrelated exponents $\beta$ and $\Delta_{\alpha\beta} = \alpha - \beta$. In agreement
with each individual measurement, we find a clearly non-vanishing
exponent $\beta = 0.1 \pm 0.03$ and a vanishing (within errors) exponent
$\Delta_{\alpha\beta} = -0.01 \pm 0.02$, and thus $\alpha = 0.09 \pm 0.03$. The expected
independence of the scaling exponents $\alpha$ and $\beta$ on the reference time
$t_0$ is shown in Fig. 3b.

We further demonstrate that the shape of the scaling function $f_S(\tilde{k})$
in Fig. 4 is universal: the data for three different initial conditions follow
a single universal function $f_S(\tilde{k})$ for all times during which the system
shows scaling dynamics. This reflects an enormous reduction of the 
possible dependence of the dynamics on variations in time and 
momentum, because the scaling function depends on only the relevant 
parameters of the system. For instance, if an initial field amplitude rep- 
resented a relevant parameter, then the scaling function would addi- 
tionally depend on the product of time or momentum and this field 
amplitude, with a new scaling exponent. In this case, extracting the 
scaling function as in Fig. 2 or Fig. 4 as a function of only the product 
of time and momentum would fail to describe the data. 

Fig. 4 | Universal scaling function and spatially averaged observables.
a, Normalized universal scaling function for varying initial conditions: (1)
blue, N ≈ 1,700, $n_s$ = 1.4 μm⁻¹; (2) red, N ≈ 2,800, $n_s$ = 0.9 μm⁻¹; (3) green,
N ≈ 1,150, $n_s$ = 2.3 μm⁻¹. All initial conditions collapse to a single
universal function $f_S$ with exponent $\zeta = 2.39 \pm 0.18$ (grey solid line) for all
times within the scaling region. The non-universal scales are the
characteristic momentum scale, $k_s$ = 2.61 μm⁻¹ (blue), 2.28 μm⁻¹ (red)
and 3.97 μm⁻¹ (green), and the global scaling factor of the momentum
distribution, 0.14 μm, 0.15 μm and 0.10 μm, respectively. The rescaled
experimental data are binned over three adjacent points in k for clarity.
The small deviations at low momenta are due to the finite expansion time
of the gas (Methods). The initial single-particle momentum distribution
n(k) at the end of the quench is depicted in the inset. We note the double
logarithmic scale, in contrast to Fig. 2. b, Scaling of averaged observables.
The fraction of particles in the scaling region, $\bar{N} \propto (t/t_0)^{-\Delta_{\alpha\beta}}$
(upper panel), becomes approximately conserved (solid black line) within the
scaling period (grey-shaded region) while being transported towards lower
momenta. Deviations in the scaling of the mean kinetic energy per particle
in the scaling region, $\bar{M}_2 \propto (t/t_0)^{-2\beta}$ (lower panel), from the
predicted scaling (solid black line) indicate the extent of the scaling period
in time. The error bars mark the 95% confidence interval. a.u., arbitrary units.

We consider the form $f_S \propto [1 + (k/k_s)^{\zeta}]^{-1}$ for the scaling
function, where the exponent $\zeta = 2.39 \pm 0.18$ is obtained from a single
maximum-likelihood fit to all experimental realizations simultaneously.
For a fixed exponent, the non-universal scales (the global scaling
factor of the momentum distribution and the momentum scale $k_s$ that
rescales the dimensionless momentum $k/k_s$) are determined from a
least-squares fit for each experimental realization (Methods). The shape
of the momentum distribution within the scaling period is markedly
different from the thermal distribution (compare Fig. 1c and Extended
Data Fig. 1), which clearly indicates a non-thermal scaling
phenomenon.
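
As a rough illustration of how the two non-universal scales can be obtained for a fixed exponent, the sketch below scans a grid of candidate momentum scales and computes the best amplitude in closed form for each. The grid scan, the function names and the synthetic data are assumptions made for illustration; the text only states that a least-squares fit is used.

```python
def shape(k, k_scale, zeta):
    """Scaling-function shape 1 / (1 + (|k|/k_scale)**zeta)."""
    return 1.0 / (1.0 + (abs(k) / k_scale) ** zeta)


def fit_scales(k_values, n_values, zeta, k_scale_grid):
    """Least-squares fit of amplitude and momentum scale at fixed zeta.

    For each trial momentum scale the amplitude minimising
    sum((n - A * shape)**2) is A = sum(n * shape) / sum(shape**2).
    """
    best = None
    for k_scale in k_scale_grid:
        s = [shape(k, k_scale, zeta) for k in k_values]
        amplitude = sum(n * si for n, si in zip(n_values, s)) / sum(si * si for si in s)
        residual = sum((n - amplitude * si) ** 2 for n, si in zip(n_values, s))
        if best is None or residual < best[0]:
            best = (residual, k_scale, amplitude)
    return best[1], best[2]


# Synthetic, noise-free data built from the values quoted for realization 1.
zeta = 2.39
k_data = [0.2 * i for i in range(1, 40)]
n_data = [0.14 * shape(k, 2.61, zeta) for k in k_data]
grid = [2.0 + 0.01 * i for i in range(101)]
k_scale_fit, amplitude_fit = fit_scales(k_data, n_data, zeta, grid)
```

Separating the linear parameter (the amplitude) from the nonlinear one (the momentum scale) keeps the search one-dimensional, which is convenient when the fit has to be repeated for every trial exponent.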

The extent of the scaling region in time is visible from the scaling
behaviour of the spatially averaged observables $\bar{N}$ and $\bar{M}_2$ (Methods),
which describe the fraction of particles and the mean energy per particle
in the time-dependent scaling region of momentum space
($|k| \le (t/t_0)^{-\beta} k_S$), respectively. From the scaling ansatz in equation (1),
we find $\bar{N} \propto (t/t_0)^{-\Delta_{\alpha\beta}}$ and hence (because
$\Delta_{\alpha\beta} \approx 0$) the emergence of a conserved quantity. This is
confirmed in Fig. 4b, in which $\bar{N}$ is approximately constant in the scaling
period, whereas it shows a clear time dependence before and after.

The values for the scaling exponents $\alpha$ and $\beta$ determine the direction
and speed with which the particles are being transported. Because these
values are positive, a given momentum k in this regime scales as
$k \propto t^{-\beta}$, so the transport is directed towards lower momenta (the
infrared). This transport of particle number leads ultimately to the
observed build-up of the quasi-condensate and the approach to thermal
equilibrium at late times. The mean energy also exhibits power-law
behaviour, $\bar{M}_2 \propto (t/t_0)^{-2\beta}$, and is in accordance with the
determined scaling exponent $\beta$. Therefore, whereas the particle number in
the scaling region is conserved, energy is transported outside this region to
higher momenta. On the basis of the scaling properties of these global
observables, we identify the scaling period to include the times
t ≈ 0.7–75 ms.

The far-from-equilibrium universal scaling dynamics in isolated
Bose gases following a strong cooling quench or for equivalent initial
conditions has been studied theoretically using non-perturbative
kinetic equations. In these studies, the universal scaling function is
expected to depend on the dimensionality d. The predicted power-law
fall-off $n(k) \propto k^{-\zeta}$, with $\zeta = d + 1$, is consistent with the
approximate form of the scaling function given by the RDM and by the
quasi-condensate at low momentum, but differs (slightly) from the
experimental results. A scaling analysis of the kinetic quasiparticle transport
yields the exponent $\beta = 1/2$ in equation (1) to be independent of d.
However, this theory is not expected to apply fully. In particular, for
d = 1, owing to the kinematic restrictions from energy and momentum
conservation, the associated transport is expected to vanish.

The contributions of higher dimensions to the one-dimensional 
physics provide a plausible way of explaining the non-standard scal- 
ing function and scaling exponents observed. Initially, there is a small 
population of atoms with momenta large enough to excite thermalizing
collisions, and a very small initial seed can lead to thermalization, as
observed previously. This is confirmed by a quasi-condensate fit to
the final momentum distribution, which, assuming thermal equilibrium,
yields an excited-state population of 11% ($T$ = 95 nK = $0.6\,\hbar\omega_\perp$).
Our experimental results provide a quantum simulation near the 
dimensional crossover between one- and three-dimensional phys- 
ics, establishing universal scaling dynamics far from equilibrium in 
a regime in which no theoretical predictions are currently available. 
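
The quoted temperature can be cross-checked against the radial trap frequency of 2π × 3.3 kHz given in Methods. The short sketch below converts one radial oscillator quantum into a temperature, assuming the usual convention that energies are compared with temperature through the Boltzmann constant; the function and variable names are illustrative.

```python
PLANCK_H = 6.62607015e-34       # Planck constant in J s (exact SI value)
BOLTZMANN_K = 1.380649e-23      # Boltzmann constant in J/K (exact SI value)


def quantum_in_nanokelvin(frequency_hz):
    """Energy h*f of one oscillator quantum, expressed as a temperature in nK."""
    return PLANCK_H * frequency_hz / BOLTZMANN_K * 1e9


# hbar * omega_perp with omega_perp = 2*pi x 3.3 kHz equals h x 3.3 kHz.
radial_quantum_nk = quantum_in_nanokelvin(3.3e3)
temperature_nk = 0.6 * radial_quantum_nk
```

One radial quantum corresponds to roughly 158 nK, so 0.6 of it is close to the 95 nK stated in the text.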

The direct experimental evidence that we have presented of scaling 
dynamics in an isolated far-from-equilibrium system is a crucial step 
towards a description of non-equilibrium evolution by non-thermal 
fixed points. Similar phenomena have recently been observed in a
spin-1 system, but with a scaling exponent of $\beta \approx 1/2$. The concept of
non-thermal fixed points has the potential to provide a unified description
of non-equilibrium evolution, reminiscent of the characterization
of equilibrium critical phenomena in terms of renormalization-group
fixed points. Such a description may lead to a comprehensive
classification of systems on the basis of their universal properties far
from equilibrium, which would be relevant for a large variety of systems 
at different scales. 

METHODS

Preparation of the gas and cooling quench. The initial thermal Bose gas is prepared
using a standard procedure to produce ultracold gases of $^{87}$Rb on an atom chip.
The description of the system at the microscale is given by the Bose Hamiltonian
with contact interactions, determined by the s-wave scattering length $a_s$ = 5.2 nm.
We prepare a thermal cloud of typically N = (2.7–3.2) × 10⁴ atoms initially in an
elongated ($\omega_\parallel = 2\pi \times 23$ Hz and $\omega_\perp = 2\pi \times 3.3$ kHz), deep
trapping potential ($V_i \approx h \times$ (130–160) kHz) at a temperature T ≈ 530–600 nK.
The atoms are held in this configuration for 100 ms to ensure a well defined initial
state. The thermal cloud is above both the dimensional crossover to an effective
one-dimensional system and the critical temperature $T_c$ for the phase transition
to a three-dimensional Bose–Einstein condensate, and therefore has a large excess
of particles in transversally excited, high-energy states. The trap depth is reduced
to its final value $V_f$ at a constant rate $R_q = (V_i - V_f)/\tau_q = h \times 25$ kHz ms⁻¹
by applying radio-frequency radiation at a time-dependent frequency (RF-knife),
leading to an energy-dependent transition of atoms from a trapped to an un-trapped
spin state. This allows the high-energy particles to rapidly leave the trap, leading
to the competing timescales $\tau_q$ of the cooling quench (see Fig. 1) and the typical
collision times needed for re-equilibration of the system. The final trap depth is
$V_f \approx h \times 2$ kHz, which is below the first radially excited state of the
trapping potential, $V_f < \hbar\omega_\perp$. At the end of the cooling ramp, the RF-knife
is held at its final position for 0.5 ms before it is faded out within 1 ms, thereby
raising the trap depth to $V \approx h \times 20$ kHz. In addition, because the RF-knife
reduces the radial trapping frequency slightly, there is a small interaction quench
(about 10%) of the one-dimensional system. The system is therefore rapidly quenched
to the quasi-one-dimensional regime, finally occupying only the transverse ground
state. Experimental realizations 1 to 3 reported in the main text have final atom
numbers of N ≈ 1,700, 2,800 and 1,150, respectively, and agree well with the RDM with
a defect density of $n_s$ = 1.4 μm⁻¹, 0.9 μm⁻¹ or 2.3 μm⁻¹ and defect width of
$\xi_s$ = 0.07 μm, 0.06 μm or 0.05 μm (corresponding to $\xi_s/\xi_h$ = 0.3, 0.3 or 0.17).
The resultant far-from-equilibrium state is held for variable times of up to t = 1 s,
during which the universal dynamics develops and takes place.
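
The duration of the cooling ramp follows directly from the quoted trap depths and ramp rate. The sketch below simply evaluates (initial depth − final depth)/rate for the two ends of the stated range of initial depths; the function name is illustrative and the resulting durations are derived here rather than quoted from the text.

```python
def ramp_duration_ms(initial_depth_khz, final_depth_khz, rate_khz_per_ms):
    """Duration of a linear ramp; depths are given as frequencies (depth / h)."""
    return (initial_depth_khz - final_depth_khz) / rate_khz_per_ms


# Initial depth h x (130-160) kHz, final depth h x 2 kHz, rate h x 25 kHz/ms.
shortest_ramp_ms = ramp_duration_ms(130.0, 2.0, 25.0)
longest_ramp_ms = ramp_duration_ms(160.0, 2.0, 25.0)
```

Under these numbers the ramp lasts between about 5 ms and 6.5 ms.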

Measuring the density and momentum distributions. The density and momentum 
distribution of the gas are measured after finite times of flight of 1.5 ms and
$t_\mathrm{tof}$ = 46 ms of free expansion. This gives access to the in situ (is) and
time-of-flight (tof) density profiles, for which the atoms are detected using
absorption and fluorescent imaging in a thin light sheet, respectively. We then calculate the radially
centred and integrated density profiles in the longitudinal direction. We correct 
the profiles for possible random sloshing effects. The quench and measurement
are repeated for each experimental shot and hold time t, 10–15 times for the in situ
data and 25–50 times for the time-of-flight data. The fast expansion in the radial
direction dilutes the gas and leads to ballistic expansion in the longitudinal direc- 
tion. Because the momentum of the particles during the expansion is therefore 
approximately conserved, the density distribution after expansion converges to 
the in situ momentum distribution of the cloud. We checked the effects of a finite 
dilution time via numerical simulations of the Gross–Pitaevskii equation, using
hydrodynamic models to determine the time dependence of the interaction con- 
stant g for early times of the expansion. For the parameters of the experiment, we 
did not find any substantial deviations from a completely ballistic expansion in 
the longitudinal direction. 

The pulled-back momentum distribution converges rapidly for high k towards 
the true momentum distribution of the gas. For low k the finite in situ size of 
the cloud does not allow for a clear separation of different momentum modes 
and atoms of different momentum overlap in the measured density after time of 
flight. This means that for a cloud of size R, particles with momentum less than 
about $k_\mathrm{is} = Rm/(\hbar t_\mathrm{tof})$ do not have time to propagate sufficiently far outside the in
situ bulk density to be clearly separated. Therefore, the pulled-back momentum
distribution for $k \lesssim k_\mathrm{is}$ resembles the in situ density profile rather than the actual
momentum distribution of the gas. 
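
The ballistic mapping between position after time of flight and momentum, and the resulting resolution limit, can be made concrete with a small sketch. The cloud size of 20 μm used below is a hypothetical value chosen for illustration only; the constants are standard values for $^{87}$Rb, and the function names are assumptions.

```python
HBAR = 1.054571817e-34                        # reduced Planck constant, J s
ATOMIC_MASS_UNIT = 1.66053906660e-27          # kg
MASS_RB87 = 86.909180527 * ATOMIC_MASS_UNIT   # mass of one 87Rb atom, kg


def position_to_wavenumber(z_m, t_tof_s, mass_kg=MASS_RB87):
    """Ballistic expansion: an atom at position z after time t has k = m z / (hbar t)."""
    return mass_kg * z_m / (HBAR * t_tof_s)


def in_situ_limit(cloud_size_m, t_tof_s, mass_kg=MASS_RB87):
    """Wavenumber below which the in situ size R masks the momentum distribution."""
    return position_to_wavenumber(cloud_size_m, t_tof_s, mass_kg)


# Hypothetical cloud size R = 20 micrometres, time of flight 46 ms.
k_is_per_um = in_situ_limit(20e-6, 46e-3) * 1e-6
```

For these assumed numbers the limit is about 0.6 μm⁻¹, so only momenta well above this value can be read off the expanded density profile directly.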

Scaling analysis. We extract the universal scaling exponents $\alpha$ and $\beta$ using a
least-squares fit of the analytical prediction in equation (1), minimizing

$$\chi^2(\alpha,\beta) = \frac{1}{N_t^2} \sum_{t,\,t_0} \chi^2_{\alpha,\beta}(t,t_0) \qquad (2)$$

where we average over all $N_t$ times t and reference times $t_0$ within the scaling period.
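The local $\chi^2_{\alpha,\beta}(t, t_0)$ is calculated via

$$\chi^2_{\alpha,\beta}(t,t_0) = \int_{k_l}^{k_h} \frac{\left[(t/t_0)^{\alpha}\,\bar{n}\big((t/t_0)^{\beta}k, t_0\big) - \bar{n}(k,t)\right]^2}{\bar{\sigma}\big((t/t_0)^{\beta}k, t_0\big)^2 + \bar{\sigma}(k,t)^2}\, dk$$
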

where $\sigma$ denotes the standard error of the mean, and $\bar{n}(k,t) = n(k,t)/N(t)$ and
$\bar{\sigma}(k,t) = \sigma(k,t)/N(t)$ are normalized by the total atom number to minimize the
influence of atom loss during the evolution; however, the atom loss is negligible
during the time period when the system shows scaling behaviour. For later times
the atom loss is roughly 10% per 100 ms, with a final atom number of approximately
40% at the end of the evolution. The rescaling of the momentum variable
inevitably necessitates a comparison between the momentum distributions at
momenta lying between the discrete values measured in the experiment. We therefore
linearly interpolate the spectrum and its error at the reference time $t_0$, which
allows us to evaluate the experimental spectrum at all momenta $(t/t_0)^{\beta}k$. For the
scaling analysis we symmetrize the spectrum by averaging the momentum distribution
over ±k to lower the noise level.
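
The comparison between a spectrum at time t and the rescaled, interpolated reference spectrum can be sketched as follows. This is an illustration under stated assumptions: the integral over momentum is replaced by a plain sum over measured points, points whose rescaled momentum falls outside the measured range are skipped, and the synthetic data, the constant error bars and all function names are hypothetical.

```python
def interpolate(x, xs, ys):
    """Linear interpolation on an ascending grid; returns None outside the grid."""
    if x < xs[0] or x > xs[-1]:
        return None
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            w = (x - xs[i]) / (xs[i + 1] - xs[i])
            return (1.0 - w) * ys[i] + w * ys[i + 1]
    return None


def local_chi2(k, n_t, err_t, n_ref, err_ref, t, t0, alpha, beta):
    """Sum of squared, error-weighted deviations between n(k, t) and the
    reference spectrum rescaled according to equation (1)."""
    s = t / t0
    total = 0.0
    for ki, ni, ei in zip(k, n_t, err_t):
        n0 = interpolate(s ** beta * ki, k, n_ref)
        e0 = interpolate(s ** beta * ki, k, err_ref)
        if n0 is None:
            continue  # rescaled momentum lies outside the measured range
        total += (s ** alpha * n0 - ni) ** 2 / (e0 ** 2 + ei ** 2)
    return total


def f_s(k_tilde):
    # Hypothetical scaling function used only to generate test data.
    return 1.0 / (1.0 + k_tilde ** 2.39)


t0, t = 4.7, 30.0
k_grid = [0.1 * i for i in range(50)]
reference = [f_s(k) for k in k_grid]
later = [(t / t0) ** 0.1 * f_s((t / t0) ** 0.1 * k) for k in k_grid]
errors = [0.01] * len(k_grid)
chi2_true = local_chi2(k_grid, later, errors, reference, errors, t, t0, 0.1, 0.1)
chi2_off = local_chi2(k_grid, later, errors, reference, errors, t, t0, 0.3, 0.0)
```

At the exponents used to generate the data only the small interpolation error contributes, whereas mismatched exponents give a much larger value.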

The exponents and their errors are estimated via the likelihood function.
To decouple the two exponents we take $\alpha = \beta + \Delta_{\alpha\beta}$ and fit the
deviation $\Delta_{\alpha\beta}$ of the exponent from the theoretical expectation
$\alpha = \beta$. We therefore define the likelihood function

$$L(\Delta_{\alpha\beta}, \beta) = \exp\left[-\tfrac{1}{2}\,\chi^2(\Delta_{\alpha\beta}, \beta)\right] \qquad (3)$$



The most probable exponents are determined by the maximum of the likelihood 
function. The error of the estimate is determined by integrating the two- 
dimensional likelihood function along one dimension and extracting the variance 
of the remaining exponent using a Gaussian fit to the marginal-likelihood func- 
tions. We find excellent agreement between the marginal likelihood functions and 
the Gaussian fits. Therefore, the Gaussian estimate of the error is equivalent to the 
(asymmetric) estimate using a change in the log-likelihood function by 1/2. The 
reason for this good agreement is the aforementioned decoupling of the exponent, 
which results to a good degree in a two-dimensional, Gaussian likelihood function 
for $L(\Delta_{\alpha\beta}, \beta)$. The estimates of the scaling exponents for different
reference times are calculated equivalently (neglecting the sum over $t_0$ in
equation (2)). The estimate is insensitive to the upper cut-off $k_h$ (within
reasonable limits). The momentum $k_S$, which limits the scaling region in the
ultraviolet, is determined as the characteristic scale for which the mean
deviation of the rescaled momentum distributions for $|k| < k_S$ and averaged over
all times t in the scaling period exceeds the 95% confidence interval at the
reference time $t_0$. The lower cut-off is taken as $k_l = 0$. Excluding momenta
$|k| < k_\mathrm{is}$ leads to a small shift in the exponents towards lower values,
but the result agrees well within the estimated errors of the exponents (less than
about 0.3σ deviation). The results of the scaling analysis for three independent
experimental realizations are shown in Extended Data Figs. 2-4. We find similar
results in all cases. The exponents and errors reported in the main text are
estimated from the combined likelihood function $L = \prod_i L_i$, where i labels the
independent experimental realizations.
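
The error estimate from the marginal likelihood can be sketched on a grid. The example below is a simplified illustration: it uses a hypothetical quadratic χ² surface in place of the measured one, and it estimates the width of the marginal distribution from its second moment instead of performing a Gaussian fit, which gives the same number when the marginal is Gaussian.

```python
import math


def likelihood_grid(chi2, deltas, betas):
    """Evaluate L = exp(-chi2 / 2) on a grid of (delta, beta) values."""
    return [[math.exp(-0.5 * chi2(d, b)) for b in betas] for d in deltas]


def marginalize_over_delta(grid):
    """Sum the likelihood over the delta axis, leaving a function of beta."""
    return [sum(row[j] for row in grid) for j in range(len(grid[0]))]


def mean_and_std(values, weights):
    """Weighted mean and standard deviation of a sampled distribution."""
    norm = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / norm
    variance = sum((v - mean) ** 2 * w for v, w in zip(values, weights)) / norm
    return mean, math.sqrt(variance)


def toy_chi2(delta, beta):
    # Hypothetical stand-in for the measured chi-squared surface.
    return ((beta - 0.10) / 0.03) ** 2 + ((delta + 0.01) / 0.02) ** 2


deltas = [-0.11 + 0.002 * i for i in range(101)]
betas = [-0.10 + 0.002 * i for i in range(201)]
grid = likelihood_grid(toy_chi2, deltas, betas)
beta_mean, beta_std = mean_and_std(betas, marginalize_over_delta(grid))
```

Because the likelihoods of independent realizations multiply, their χ² surfaces add; a combined estimate can therefore be obtained by summing the individual surfaces before exponentiating.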

The universal function $f_S$ is determined equivalently, where for each fixed
exponent $\zeta$ the non-universal scales are determined from a least-squares fit to
each experimental realization separately. To minimize the influence of the finite
expansion of the gas, we consider momenta $|k| > k_\mathrm{is}$ for the determination of
$f_S$. The likelihood function is subsequently defined by the averaged residuals of
the scaled data as compared to the universal scaling function $f_S = (1 + k^{\zeta})^{-1}$ for
all realizations simultaneously. The error is again estimated using a Gaussian fit
to the (one-dimensional) likelihood function. The non-universal scales for the
most likely exponent $\zeta = 2.39 \pm 0.18$ for experimental realizations 1 to 3 are the
characteristic momentum scale, $k_s$ = 2.61 μm⁻¹, 2.28 μm⁻¹ and 3.97 μm⁻¹, respectively,
and the global scaling factor of the momentum distribution, 0.14 μm,
0.15 μm and 0.10 μm.

Global observables. We define the global observables

$$\bar{N} = \int_{|k| \le (t/t_0)^{-\beta} k_S} dk\, \frac{n(k,t)}{N(t)} \propto (t/t_0)^{-\Delta_{\alpha\beta}} \qquad (4)$$

$$\bar{M}_{n} = \frac{1}{\bar{N}(t)} \int_{|k| \le (t/t_0)^{-\beta} k_S} dk\, |k|^{n}\, \frac{n(k,t)}{N(t)} \propto (t/t_0)^{-n\beta} \qquad (5)$$

where $k_S$ = 6.5–8 μm⁻¹ defines the high-momentum cut-off for the scaling region in
k. In the main text, we consider the fraction of particles in the scaling region,
$\bar{N} \propto (t/t_0)^{-\Delta_{\alpha\beta}}$, and the mean kinetic energy per particle in the
scaling region, $\bar{M}_2 \propto (t/t_0)^{-2\beta}$. The global observables $\bar{N}$ and
$\bar{M}_n$ show independent scaling in time with the exponents $\Delta_{\alpha\beta}$ and
$\beta$, whereas the integral ranges depend non-trivially on $\beta$.
The results for each experimental realization are shown in Extended Data Fig. 5. In the
main text, we report the result obtained by averaging over all experimental realizations.
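
How the two global observables behave inside a time-dependent scaling region can be checked numerically on synthetic data. The sketch below assumes a distribution that obeys equation (1) with equal exponents, a hypothetical scaling function, a simple Riemann sum for the integrals, and a normalisation of the mean energy by the number of particles inside the scaling region; all names and numerical values are illustrative.

```python
def scaling_function(k_tilde, zeta=2.39):
    """Hypothetical universal function used to generate synthetic data."""
    return 1.0 / (1.0 + abs(k_tilde) ** zeta)


def distribution(k, t, t0, alpha, beta):
    """Equation (1): n(k, t) = (t/t0)**alpha * f_S((t/t0)**beta * k)."""
    s = t / t0
    return s ** alpha * scaling_function(s ** beta * k)


def global_observables(k_values, n_values, k_cut, moment=2):
    """Fraction of particles with |k| <= k_cut and their mean |k|**moment."""
    dk = k_values[1] - k_values[0]
    total = sum(n_values) * dk
    inside = [(k, n) for k, n in zip(k_values, n_values) if abs(k) <= k_cut]
    number_inside = sum(n for _, n in inside) * dk
    fraction = number_inside / total
    mean_moment = sum(abs(k) ** moment * n for k, n in inside) * dk / number_inside
    return fraction, mean_moment


t0, alpha, beta, k_cutoff = 4.7, 0.1, 0.1, 5.0
k_grid = [-20.0 + 0.005 * i for i in range(8001)]
results = {}
for t in (4.7, 20.0, 70.0):
    n_t = [distribution(k, t, t0, alpha, beta) for k in k_grid]
    # The scaling region shrinks in time as (t/t0)**(-beta).
    results[t] = global_observables(k_grid, n_t, (t / t0) ** (-beta) * k_cutoff)
```

For equal exponents the particle fraction stays constant while the second moment decays as (t/t0)^(−2β), mirroring the behaviour described for the scaling period.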

Model fits. The density profile ρ(z) is determined by a fit to the experimental
in situ density, measured after 1.5 ms of free expansion. For the
RDM, we consider a fixed atom number N(t) and a scaled density profile
$\rho(z, t) = b^{-1}(t)\,\rho\big(z\,b^{-1}(t)\big)$ in the Thomas–Fermi approximation, leaving the
scaling factor b(t) as the only free parameter (more precisely, a previous result is
used for the density profile, which takes the radial swelling of the condensate into
account). We neglect possible finite-temperature fluctuations and any contributions
from radially excited states in the RDM, assuming the gas to be dominated by
solitonic defects. For early times, the high-momentum modes do not have substantial
thermal occupation, and we find good agreement with the RDM.

For the quasi-condensate, we determine the thermal density profile for a
given temperature T and chemical potential μ using simulations of the stochastic
Gross–Pitaevskii equation (see, for example, ref. 36). The broadening of the density
distribution is in this case due to the finite temperature of the gas. The density profile
is subsequently fitted via $\rho(z, t) = \rho_\mathrm{QC}(z, T(t), \mu(t)) + \rho_\perp(z, T(t), \mu(t))$. Here we
take into account the thermal occupation of radially excited states $\rho_\perp$ within the
semiclassical approximation, which is non-negligible for late times. The chemical
potential μ is fixed by the total atom number, $\int \rho(z, t)\,dz = N(t)$.

The fitted density profiles are used to determine the single-particle momentum 
distribution n(k, t) of the inhomogeneous system from a least-squares fit of the exper- 
imental data to the theoretical predictions within the local density approximation. 
For both models we restrict the fitting region to $|k| > k_\mathrm{is}$, owing to the simplified
hydrodynamic model for the finite expansion of the gas. The RDM is fitted over the
full momentum range that is accessible in the experiment. For high defect densities, 
the RDM fit shows correlations between defect density and width because these two 
scales become of the same order for the far-from-equilibrium state. Because it is theo- 
retically expected that the defect width is approximately conserved during evolution, 
we fix the defect width to its mean value within the first 25 ms of evolution, leaving 
the defect density as the only free parameter. We find reasonable agreement between 
the RDM results and the independent scaling analysis. In particular, the RDM is 
clearly preferred compared to a thermal distribution within the scaling period. 

For the fits in thermal equilibrium we consider a quasi-condensate model,
including thermal occupation of radially excited states. Considering the validity
regime of the quasi-condensate model, we restrict the fitting procedure to momentum
modes with energy less than $\hbar\omega_\perp$. We determine the chemical potential μ by
fixing the atom number within this region of momentum space. This leads to a 
slight shift in the chemical potential compared to the in situ fits. For late times we 
find excellent agreement with the experimental data, demonstrating the relaxation 
of the system to thermal equilibrium. 


Data availability 
The data that support the findings of this study are available from the correspond- 
ing author on reasonable request. 

Extended Data Fig. 1 | Results of random-defect and quasi-condensate
models. The time evolution of the characteristic scales for the
experimental data presented in Fig. 4a (initial condition 1) are shown.
The resulting temperature T (blue) and defect density $n_s$ (red) are shown
in the upper panel for the full time evolution. The defect width for the
random-defect model is fixed to $\xi_s$ = 0.087 μm, determined by the mean
over the first 25 ms of the evolution. The defect density within the scaling
region shows a power-law dependence consistent with the exponent $\beta$ of
the scaling evolution reported in the main text. For later times deviations
occur, signalling the end of the scaling region. The quality of the model
fit is depicted in the lower panel (black squares), where positive and
negative values favour the random-defect and quasi-condensate models,
respectively. The random-defect model is strongly preferred for the first
roughly 100 ms, after which the system converges to a thermal
quasi-condensate within about 400 ms. The absolute values of the reduced
$\chi^2$ for the random-defect (RD) model are about 1 and 5 for early and late
times, respectively; those for the quasi-condensate (QC) model are about 25 and 1.

Extended Data Fig. 2 | Rescaling analysis for different initial
conditions. a–c, Original (left) and rescaled (right) single-particle
momentum distribution n(k, t) for different initial conditions
(a–c correspond to initial conditions 1–3 in Fig. 4a). Each distribution
is normalized by the time-dependent atom number N(t) and the time
is encoded in the colour scale. The grey dashed vertical lines indicate
the scaling regime in k. The scaling exponents $\alpha \approx \beta$ and the deviation
between them, $\Delta_{\alpha\beta} = \alpha - \beta$, are in excellent agreement with the mean
values reported in the main text. We note that here we compare the data
for the full experimental resolution in k. The distribution at the reference
time $t_0$ = 4.7 ms is given by the grey line; its width indicates the 95%
confidence interval.

Extended Data Fig. 3 | Likelihood function for different initial
conditions. a–c, Two-dimensional likelihood functions (colour scales)
and marginal-likelihood functions (top and right) for different initial
conditions (a–c correspond to initial conditions 1–3 in Fig. 4a). A clear
peak at non-zero $\alpha \approx \beta$ is visible for each realization, whereas the deviation
between the two exponents is $\Delta_{\alpha\beta} = \alpha - \beta \approx 0$. For scan 2 (b), a small
condensate may have been present before the quench, which led to the
larger extent of the likelihood function. Gaussian fits are in excellent
agreement with the marginal-likelihood functions and determine the error
of the scaling exponents reported in Extended Data Fig. 2.

Extended Data Fig. 4 | Time evolution of scaling exponents for
different initial conditions. a–c, Scaling exponents $\alpha \approx \beta$ (blue) and
deviation between the two exponents $\Delta_{\alpha\beta} = \alpha - \beta$ (red) for different
initial conditions (a–c correspond to initial conditions 1–3 in Fig. 4a),
determined from the likelihood function for each reference time $t_0$, are in
good agreement with the predicted mean (black solid and dashed lines).
The error bars denote the standard deviation obtained from a Gaussian fit
to the marginal-likelihood function at each reference time separately.

Extended Data Fig. 5 | Spatially averaged observables for different
initial conditions. a–c, Time evolution of the fraction of particles in the
scaling region, $\bar{N} \propto (t/t_0)^{-\Delta_{\alpha\beta}}$ (red), and the mean kinetic energy
per particle in the scaling region, $\bar{M}_2 \propto (t/t_0)^{-2\beta}$ (blue), for different
initial conditions (a–c correspond to initial conditions 1–3 in Fig. 4a). Within
the scaling region (grey-shaded areas), $\bar{N}$ is approximately conserved. The solid
black lines are the approximately conserved value and the scaling solutions of
equation (5). The error bars indicate the 95% confidence interval.

In recent years, artificial neural networks have become the flagship
algorithm of artificial intelligence. In these systems, neuron
activation functions are static, and computing is achieved through 
standard arithmetic operations. By contrast, a prominent branch 
of neuroinspired computing embraces the dynamical nature of the 
brain and proposes to endow each component of a neural network 
with dynamical functionality, such as oscillations, and to rely on 
emergent physical phenomena, such as synchronization, for
solving complex problems with small networks. This approach
is especially interesting for hardware implementations, because 
emerging nanoelectronic devices can provide compact and energy- 
efficient nonlinear auto-oscillators that mimic the periodic spiking 
activity of biological neurons. The dynamical couplings between
oscillators can then be used to mediate the synaptic communication 
between the artificial neurons. One challenge for using nanodevices 
in this way is to achieve learning, which requires fine control and 
tuning of their coupled oscillations; the dynamical features of
nanodevices can be difficult to control and prone to noise and
variability. Here we show that the outstanding tunability of
spintronic nano-oscillators—that is, the possibility of accurately 
controlling their frequency across a wide range, through electrical 
current and magnetic field—can be used to address this challenge. 
We successfully train a hardware network of four spin-torque nano- 
oscillators to recognize spoken vowels by tuning their frequencies 
according to an automatic real-time learning rule. We show that the 
high experimental recognition rates stem from the ability of these 
oscillators to synchronize. Our results demonstrate that non-trivial 
pattern classification tasks can be achieved with small hardware 
neural networks by endowing them with nonlinear dynamical 
features such as oscillations and synchronization. 

Spin-torque nano-oscillators are natural candidates for building
hardware neural networks made of coupled nanoscale oscillators.
These nanoscale magnetic tunnel junctions emit
microwave voltages when they are driven by direct-current injection 
in a regime of sustained magnetization precession through the effect 
of spin torque. In addition, they have exceptional capacities to syn- 
chronize their rhythms to periodic electric and magnetic input signals 
and to other spin-torque nano-oscillators. This property originates
from the high tunability of their frequency, in other words, the large 
frequency changes induced by applied d.c. currents and magnetic fields. 
Single spin-torque nano-oscillators can achieve impressive cognitive 
computations. However, it has not been shown experimentally that a
coupled network of spin-torque nano-oscillators can learn to perform
computational tasks through synchronization. Here, we use the ability 
of spin-torque nano-oscillators to modify their frequency in response 
to injected direct currents to train in real-time a network of coupled 
oscillators to categorize different input patterns into different synchro- 
nization configurations”!”!8. 


We transpose to hardware the neural network illustrated in Fig. 1a (ref. 17) 
with the set-up illustrated in Fig. 1b. The four neurons in Fig. 1a are 
experimentally implemented with four spin-torque nano-oscillators 
(Fig. 1b), in our case circular magnetic tunnel junctions with 
375 nm diameter and an FeB free layer with a vortex as ground state 
(see Methods)*°. The double arrow connections between neurons 
(blue in Fig. 1a) indicate that the output of neuron i influences the 
behaviour of neuron j, and vice versa. We implement these symmetric 
neural interconnections by connecting electrically the four oscillators 
using millimetre-long wires as schematized in Fig. 1b: in this configu- 
ration, the microwave current generated by each oscillator propagates 
in the electrical microwave loop and in turn influences the dynam- 
ics, and in particular the frequency, of the other oscillators through 
the microwave spin-torques it creates”*. The sum of all microwave 
emissions is detected by a spectrum analyser. Importantly, we can 
control the frequency of each oscillator by adjusting the direct cur- 
rent flowing through each (see Methods and Extended Data Fig. 1). 
Here, for computing, we choose direct currents leading to close but 
not identical frequencies. The light blue curve in Fig. 1c shows a four- 
peak spectrum typical of this regime of moderate coupling where the 
dynamics of the oscillators are correlated but do not lead to mutual 
synchronization. 

The inputs to the neural network are encoded in the frequencies f_A 
and f_B of two fixed-amplitude microwave signals. Injected in a strip line 
fabricated above the active magnetic layers, they modify the dynamics 
of the oscillators through the radiofrequency magnetic fields they gen- 
erate. Figure 1d shows that when the frequency of one of the microwave 
sources is swept, each oscillator synchronizes to the source in turn. 
Indeed, when the frequency of the source gets close to the frequency 
of one of the oscillators, the strong signal of the source pulls the adapt- 
able frequency of the oscillator towards its own. In the locking range, 
the frequency of the oscillator becomes equal to the frequency of the 
source’’. The dark blue curve in Fig. 1c shows an example of spectrum 
measured when the two microwave inputs are injected simultaneously. 
Two peaks (in red) appear at frequencies f_A and f_B owing to capacitive 
coupling with the strip line. In comparison to the spectrum without 
inputs (light blue curve), the emission peaks of oscillators 1 and 2 are 
pulled towards f_A, whereas oscillator 4 is phase-locked to input B (its 
emission peak merges with the one of input B at f_B). We label this 
synchronization configuration as (4B). 

The possible outputs of the neural network, represented in different 
colours in Fig. 1e, are the different synchronization configurations that 
appear for different frequencies of the two input signals, keeping the 
direct currents through the oscillators fixed. Depending on the fre- 
quencies of inputs, zero (grey regions), one, or two oscillators are phase- 
locked. For example, in the petrol-blue region labelled (2A), oscillator 
2 is synchronized to input A. In the white region labelled (1A,3B), 
oscillators 1 and 3 are synchronized to inputs A and B, respectively. 

Fig. 1 | Approach for pattern classification with coupled spin-torque 
nano-oscillators. a, Schematic of the emulated neural network. 
b, Schematic of the experimental set-up with four spin-torque 
nano-oscillators electrically connected in series and coupled through their own 
emitted microwave currents. Two microwave signals encoding information 
in their frequencies f_A and f_B are applied as inputs to the system through a 
strip line, which translates into two microwave fields. The total microwave 
output of the oscillator network is recorded with a spectrum analyser. 
c, Microwave output emitted by the network of four oscillators without 
(light blue) and with (dark blue) the two microwave signals applied to 
the system. The two curves have been shifted vertically for clarity. The 
four peaks in the light blue curve correspond to the emissions of the four 


We now describe how this neural network can recognize patterns 
by classifying spoken vowels, which are naturally characterized by 
frequencies called formants”®. We use as input data a subset of the 
Hillenbrand database (available at https://homepages.wmich.edu/~hil- 
lenbr/voweldata.html; see Supplementary Information) comprising 
seven vowels pronounced by 37 different female speakers, where each 
vowel is characterized by 12 different frequencies. Formant frequencies 
are typically in the range between 500 Hz and 3,500 Hz, so a transformation 
is needed to obtain input frequencies (f_A, f_B) in the range 
of operation of our oscillators, between 325 MHz and 380 MHz. As 
detailed in Methods, we obtain f_A and f_B through two different linear 
combinations of the 12 formant frequencies that fit the grid-like 
geometry of the oscillator synchronization maps. In the resulting map 
shown in Fig. 1f, each point corresponds to one speaker. The spread 
in frequency for each vowel indicates that each speaker has a different 
pronunciation. Our goal is to recognize the vowel presented as input to 
the oscillator network independently of the speaker. For this purpose, 
the scattered points corresponding to each vowel pronounced by dif- 
ferent speakers should all be contained inside a different region of the 
oscillator synchronization map in Fig. 1e. 

As can be seen from Fig. 2a, in which the input vowel map and the 
oscillator synchronization map are superposed, initially they do not 
coincide: the initial oscillator frequencies have been set randomly and 
are not adequate to solve the problem. The oscillatory neural network 
must learn to perform the classification properly. During this training 
stage, the internal parameters of the network need to be finely tuned 
until each synchronization region encompasses the cloud of points 
corresponding to the vowel that it has been assigned. For this pur- 
pose, we take advantage of the highly tunable nature of spin-torque 
nano-oscillators to modify the synchronization map by tuning the 
direct current through each oscillator, adapting a training algorithm 
first proposed in ref. 17. We have developed an automatic real-time 
learning procedure involving a feedback loop between the experimental 
setup and the computer that controls it (see Methods). At each training 
step, we consecutively apply seven inputs (f_A, f_B) to the oscillators, one 

oscillators. The two narrow red peaks in the dark blue curve correspond to 
the external microwave signals with frequencies f_A and f_B. d, Evolution of 
the four oscillator frequencies when the frequency of external source A is 
swept. One after the other, the oscillators phase-lock to the external input 
when the frequency of the source approaches their natural frequency. In 
the locking range, the oscillator frequency is equal to the input frequency. 
e, Experimental synchronization map as a function of the frequencies 
of the external signals f_A and f_B. Each colour corresponds to a different 
synchronization state. f, Inputs applied to the system, represented in the 
(f_A, f_B) plane. Each colour corresponds to a different spoken vowel, and 
each data point corresponds to a different speaker. 


for each vowel, randomly picked between the different speakers. The 
oscillator emissions corresponding to each of the seven input micro- 
wave signals are recorded with a spectrum analyser. A computer iden- 
tifies the corresponding synchronization states (see Methods). If all the 
seven vowels have been correctly classified in their assigned synchronization 
regions of the map (f_A, f_B), the direct currents are not changed. If 
one or several vowels have not been correctly classified, direct currents 
in the oscillators are modified to bring the assigned synchronization 
regions closer to the corresponding input frequency pairs (f_A, f_B) and 
thus reduce the classification error (see Methods). In the next learning 
step, another set of seven vowels is applied, and so on. 

Figure 2 shows synchronization maps obtained at different stages 
of the training process (Fig. 2a-d), together with the evolution of the 
direct currents applied to the oscillators (Fig. 2e), their frequencies 
(Fig. 2f) and the average recognition rates for the seven vowels (Fig. 2g) 
(for a short video (20 s), see Supplementary Information or https:// 
youtu.be/bbRqqcxc-po; for a longer video (3 min 30 s), see https:// 
youtu.be/IH Ynh0oJgOA). After 48 training steps, an optimum is found, 
direct currents and frequencies stop evolving, and the recognition rates 
stop increasing, signifying that the training process can be stopped. 
During training, we do not use all the vowels in the database. We always 
retain 20% of the vowels to test the ability of the system to recognize 
unknown data. The final recognition rates on the training and testing 
datasets reach values up to 89% and 88%, respectively (Fig. 2g). 

We now interpret these experimental recognition rates by compar- 
ing them to the performances that can be achieved with ideal oscil- 
lators trained on the same task with the same learning process. For 
this purpose, we model the oscillator dynamics with coupled van der 
Pol equations accounting for their collective magnetization coor- 
dinates (see Supplementary Information)”. The simulated oscilla- 
tors are noiseless and differ only by a 2% mismatch in their natural 
frequencies, analogous to the one observed experimentally. We first 
vary their ability to synchronize by modifying their frequency tuna- 
bility (see Supplementary Information). Black circles in Fig. 3a show 
the recognition rate of the ideal simulated network as a function of the 

Fig. 2 | Learning to classify patterns by tuning the frequencies of 
oscillators. a-d, Experimental synchronization map as a function of 
the frequencies of the external signals, at different steps of the training 
procedure: a, step 0; b, step 7; c, step 15; and d, step 86. The coloured dots 
represent the inputs applied to the oscillatory network: vowels pronounced 
by different speakers. Different vowels are shown in different colours. 


average locking range of the oscillators normalized by their frequency 
difference. The recognition rate increases linearly with the oscillator 
locking ranges (see dotted blue linear fit in Fig. 3a). Indeed, as shown 
in the simulated maps of Fig. 3b, when the oscillator locking ranges 
increase, the regions of synchronization grow, thus encompassing and 
classifying an increasing number of points in each of the different vowel 
clouds. As shown in Fig. 3c, d, the mutual coupling between oscillators 
also enhances their locking ranges”’, leading to increased recognition 
rates when the mutual interactions increase. The red star in Fig. 3a 
pinpoints where the experimental result features in this graph. The 
experimental vowel recognition rate of 89% is close to the maximum 
recognition rate of 94% that can be achieved with the same neural 




A video is provided as Supplementary Information. e, Direct current 
applied through each oscillator as a function of the number of training 
steps. f, Frequency of each oscillator as a function of the number of 
training steps. g, Recognition rates obtained with the sets of data points 
used for training and for testing, as a function of the number of training 
steps. 


network composed of ideal, noiseless oscillators. This high perfor- 
mance is due to the large experimental locking ranges resulting from 
the high tunability, coupling and low noise of the hardware spin-torque 
nano-oscillators. 

We then compare the dynamical oscillator-based neural network 
studied in this paper to more conventional forms of neural networks. 
For this purpose, we first extract a reference value for the experimen- 
tal recognition rate by repeating the training procedure experimen- 
tally several times with different combinations of training and testing 
sets (see Methods). This cross-validation technique yields an average 
value of 84.3% for the experimental recognition rate on the testing set 
that we can compare to the performance of other neural networks. First, we 

Fig. 3 | Comparing the recognition rates of experimental and ideal 
oscillators. Simulations of vowel recognition with a network of four 
identical oscillators trained with the same procedure as in the 
experiments are illustrated, in the absence of noise. The simulated 
oscillators differ only by a 2% mismatch in their natural frequencies. 
a, Recognition rate on the training set (black circles) as a function of the 
average oscillator locking range normalized by the frequency difference 
between oscillators (LR/FD). The locking range is varied by modifying 
the tunability of the oscillator frequency. The blue dotted line is a linear 
fit to the simulation results. The red star indicates where experimental 
oscillators feature in this graph. b, Synchronization maps simulated 
with the network of oscillators used in a, for three different values of 
the normalized locking range. c, Recognition rate on the training set 
(black circles) as a function of the mutual coupling between oscillators 
normalized by their coupling to the microwave inputs. The blue dotted line 
is a linear fit to the simulation results. d, Synchronization maps simulated 
with the network of oscillators used in c, for three different values of the 
normalized coupling e. 

Fig. 4 | Benchmarking performances with classical neural networks. 
a, Flow chart of the simulated multilayer perceptron. The trained 
parameters are indicated in red. b, Recognition rate obtained through 
cross-validation versus the total number of trained parameters for the 
neural network in a, in which the number of hidden neurons is varied. 


consider a conventional, static, multi-layer neural network. This kind 
of network can achieve better-than-human recognition rates at complex 
tasks, such as image classification. This performance, however, comes at 
the expense of the large number of parameters that need to be trained, 
a major hurdle for hardware implementation. Figure 4b shows the rec- 
ognition rate of a multilayer perceptron, trained in software through 
backpropagation on the same database as the experimental neural net- 
work, with 30,000 vowel presentations (see Methods). As illustrated in 
Fig. 4a, this network, composed of static neurons, takes as inputs the 
12 formant frequencies characterizing each pronounced vowel. The 
hidden layer neurons receive a weighted sum of these inputs (plus a bias 
term). The output layer, with softmax activation functions, has seven 
neurons, one for each vowel class (see Methods). As can be seen in 
Fig. 4b, the recognition rate is excellent, reaching 97% when the num- 
ber of trained parameters is large (synaptic weights illustrated in red in 
Fig. 4a). However, the performance rapidly degrades for small numbers 
of trained parameters, diving below 65% for 27 trained parameters. This 
result is quite general: as can be seen from Extended Data Fig. 2, state- 
of-the-art networks with feedback such as standard recurrent neural 
networks or long short-term memory networks have limited perfor- 
mance when the number of trained parameters is small. In contrast, 
the recognition rate of our experimental oscillatory neural network is 
over 84% for only 30 trained parameters: as illustrated in red in Fig. 4c, 
the 26 weights converting formants to inputs, and the currents through 
the oscillators. For an ideal, noiseless, oscillatory network, the success 
rate reaches 89% after cross-validation. The network also learns rap- 
idly (350 vowel presentations are used). This high performance with a 
small number of trained parameters comes from the combination of 
two phenomena: as shown in Fig. 3c, the oscillatory network can do 
better than the sum of its individual components, owing to its complex, 
coupled, dynamical features, and in addition, the oscillators collectively 
contribute to pattern recognition by synchronizing to the inputs. This 
result shows that the performance of hardware neural networks can be 
boosted by enhancing neuron functionalities beyond simple nonlinear 
activation functions, through oscillations and synchronization. 
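
The parameter counts quoted in this comparison can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the perceptron is assumed to have a bias on every hidden and output neuron, and the 26 formant weights are assumed to split into two linear combinations of 12 coefficients plus one offset each (the text gives only the total).

```python
def mlp_parameter_count(n_inputs: int, n_hidden: int, n_outputs: int) -> int:
    """Weights plus biases of a perceptron with one hidden layer."""
    hidden_layer = (n_inputs + 1) * n_hidden    # input weights + one bias per hidden neuron
    output_layer = (n_hidden + 1) * n_outputs   # hidden weights + one bias per output neuron
    return hidden_layer + output_layer


def oscillator_network_parameter_count(n_formants: int = 12,
                                       n_inputs: int = 2,
                                       n_oscillators: int = 4) -> int:
    """Formant-to-input weights plus one direct current per oscillator.

    Assumption: each of the two input frequencies is a linear combination
    of the formants with one additional offset term.
    """
    formant_weights = n_inputs * (n_formants + 1)
    return formant_weights + n_oscillators


if __name__ == "__main__":
    for hidden in (1, 5, 20):
        print(hidden, mlp_parameter_count(12, hidden, 7))
    print(oscillator_network_parameter_count())
```

Under these assumptions, a perceptron with a single hidden neuron has 27 trained parameters and the oscillator network has 30, which matches the figures given in the text.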

In the future, such dynamical neural networks will have to be scaled up 
to solve challenging classification problems on software-benchmarked 
databases. Spin-torque nano-oscillators offer numerous advantages 
towards this goal. Their energy consumption is comparable to or 
lower than complementary metal-oxide-semiconductor (CMOS) 
oscillators, and contrary to the latter, their lateral dimensions can be 
scaled down to a few nanometres in diameter (a detailed comparison 
is presented in Extended Data Table 2). Their quality factor can exceed 
several thousands”®, and their natural frequency can be controlled by 
the aspect ratio of the magnetic dot from hundreds of megahertz to 
several gigahertz in small pillars, opening the path to nano-oscillators 
assemblies with a wide range of natural frequencies!®. In addition, their 
simple structure is similar to spin-torque magnetic random access 
memory cells, which means that they can be produced by billions 


The red star corresponds to the experimental results with the network 
of spin-torque nano-oscillators. Exp., experimental. c, Flow chart of the 
experimental oscillatory neural network. The trained parameters are 
indicated in red. 


on top of CMOS. Finally, their synchronization can be detected with 
CMOS circuits that count the number of oscillations”’ or measure the 
additional d.c. voltages produced by the oscillators when they phase- 
lock (see Methods and Extended Data Fig. 3)*°. Therefore, the wide 
variety of possible magnetic and electric couplings offered by spin- 
tronics’’4, and the different ways of driving and controlling mag- 
netization dynamics (spin torques, spin-orbit torques, electric fields), 
could be exploited in the future to implement large-scale hardware 
neural networks’>. 


Online content 

Any methods, additional references, Nature Research reporting summaries, source 
data, statements of data availability and associated accession codes are available at 
https://doi.org/10.1038/s41586-018-0632-y. 

METHODS 
Samples. Magnetic tunnel junction (MTJ) films with a stacking structure 
of buffer/PtMn(15)/Co70Fe30(2.5)/Ru(0.9)/Co60Fe20B20(1.6)/Co70Fe30(0.8)/MgO(1)/ 
Fe80B20(6)/MgO(1)/Ta(8)/Ru(7) (thicknesses in nm) were prepared by 
ultrahigh-vacuum (UHV) magnetron sputtering. After annealing at 360 °C for 1 h, the 
resistance-area product was RA ≈ 3.6 Ω µm². Circular MTJs with a diameter of about 
375 nm were patterned using Ar ion etching and e-beam lithography. The resistance 
of the samples is close to 40 Ω, and the magneto-resistance ratio is about 100% at 
room temperature. The FeB layer presents a structure with a single magnetic vortex 
as the ground state for the dimensions used here. In a small region called the vortex 
core (of about 12 nm diameter at remanence for our materials), the magnetization 
spirals out of plane. Under direct current injection and the action of the spin transfer 
torques, the core of the vortex steadily gyrates around the centre of the dot with a 
frequency in the range of 150 MHz to 450 MHz for the oscillators we used here. 
Database and inputs. In this study, we classify seven spoken vowels with the oscilla- 
tory network. Spoken vowels are characterized by a set of frequencies called formants, 
which we obtain from a subset of the Hillenbrand database (https://homepages. 
wmich.edu/~hillenbr/voweldata.html) given in Supplementary Information. We 
use the first three formants (F1, F2 and F3) sampled at four different times of the 
duration of the spoken vowel: at the steady state and at 20%, 50% and 80% of the 
vowel duration (that is, 12 parameters in total). When one of these 12 parameters 
could not be measured, or when irresolvable formant mergers occurred, Hillenbrand 
et al. (ref. 28) entered a zero for this parameter in the database. For our study, we have removed 
the vowel utterances whose corresponding set of formants is not complete. Moreover, 
we use the same number of speakers for each vowel. The resulting formant database 
comprising 37 female speakers that we used is provided as Supplementary Data. 
We perform two linear combinations of these formants to obtain two characteristic 
frequencies (f, and fg) in the range of operation of the spin-torque nano-oscillators 
(between 325 MHz and 380 MHz for the applied field value that we are using). 
To choose the coefficients of the two linear combinations, we first record an 
experimental synchronization map that is used as a calibration of the network. The 
calibration map allows us to assign a synchronization pattern to each vowel. Then, the linear 
transformation of the formants that best matches the data points of each vowel with 
its associated synchronization pattern is determined through fitting by least-square 
regression. The coefficients used in the two linear combinations and the two 
frequencies f_A and f_B corresponding to each vowel are provided as Supplementary Data. 
Once this calibration is done and the coefficients and characteristic frequencies 
are calculated, the direct currents are reset to random values to begin the learning 
experiment. Two fixed-amplitude microwave signals with frequencies f_A and f_B are 
used as inputs to the experimental network of coupled nano-oscillators. 
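
The least-squares fit of the two linear combinations can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, an offset term is assumed in each combination, and the target frequency pair for each utterance (for example the centre of the synchronization region assigned to its vowel) is assumed to be chosen beforehand from the calibration map.

```python
import numpy as np


def fit_formant_map(formants: np.ndarray, targets: np.ndarray) -> np.ndarray:
    """Fit two linear combinations of the formants by least squares.

    formants: array of shape (n_utterances, 12), formant frequencies in Hz.
    targets:  array of shape (n_utterances, 2), desired (f_A, f_B) in MHz
              (assumed to come from the calibration map).
    Returns an array of shape (13, 2): 12 coefficients plus one offset
    for each of the two input frequencies.
    """
    design = np.hstack([formants, np.ones((formants.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def apply_formant_map(formants: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Convert formants to input frequency pairs (f_A, f_B)."""
    design = np.hstack([formants, np.ones((formants.shape[0], 1))])
    return design @ weights
```

Once fitted, `apply_formant_map` would be evaluated for every utterance to obtain the frequency pair that is sent to the two microwave sources.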
Experimental set-up. Extended Data Fig. 1 shows a schematic of the experimental 
set-up with the four coupled vortex nano-oscillators. A magnetic field of μ0H = 530 
mT is applied perpendicularly to the oscillator layers to get an efficient spin transfer 
torque acting on the oscillator vortex core. A direct current is injected into each 
oscillator to induce vortex dynamics, which leads to periodic oscillations of the 
magnetoresistance, giving rise to an oscillating voltage at the same frequency as 
the vortex core dynamics. The four oscillators are electrically connected in series by 
millimetre-long wires. They are therefore coupled through the microwave currents 
they emit, and too far away to be coupled through the magnetic dipolar fields that 
they radiate. Four direct currents (I_DC1, I_DC2, I_DC3, I_DC4) are supplied to the circuit 
by four different sources, allowing independent control of the current 
flowing through each oscillator. The actual current flowing through each spin-torque 
oscillator is given by I_STO1 = I_DC1, I_STO2 = I_DC2 + I_DC1, I_STO3 = I_DC3 + I_DC2 + I_DC1 
and I_STO4 = I_DC4 + I_DC3 + I_DC2 + I_DC1, respectively, where I_STOi corresponds to the 
current flowing through the ith oscillator. Two microwave sources are used to inject 
two external microwave signals with frequencies f_A and f_B and power P = -9 dBm 
through a strip line, creating two microwave fields as inputs to the oscillator net- 
work. The amplitude of the generated magnetic field, set by Ampere’s law, depends 
only on the cross-section of the antenna (in addition to the distance between the 
strip line and the active magnetic layer of the oscillators). Therefore, the length of 
the antenna is only set by the number of oscillators it should cover. In our case, the 
strip line has a width of 2.5 µm and is fabricated 370 nm above the pillar (separated 
by an insulating layer). The resulting input microwave fields have an amplitude of 
0.1 mT. They strongly affect the magnetization dynamics of the four oscillators, and 
thus the total microwave output emitted by the network. The microwave emissions 
are recorded with a spectrum analyser. As can be seen in Fig. 1d, the input signals 
from the antenna can be detected in addition to the oscillator emissions due to 
capacitive coupling between the strip line antenna and the metallic electrodes con- 
necting the oscillator. The analysis of the output, which depends on the frequencies 
of the microwave inputs, can therefore easily be used to classify the spoken vowels. 
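
Because the oscillators are connected in series, the current through each oscillator is a running sum of the source currents, as stated earlier in this section. The sketch below illustrates that relation and its inverse; the function names are hypothetical and the currents are treated as plain numbers in arbitrary but consistent units.

```python
from itertools import accumulate


def oscillator_currents(source_currents):
    """Current through each oscillator: cumulative sum of the source currents."""
    return list(accumulate(source_currents))


def source_currents_for(target_oscillator_currents):
    """Source currents needed to obtain the requested oscillator currents."""
    sources = []
    previous = 0
    for current in target_oscillator_currents:
        # Each source only has to supply the difference to the previous oscillator.
        sources.append(current - previous)
        previous = current
    return sources
```

One consequence is that changing a single source current shifts the current through every oscillator further along the chain, so a control program that wants to change only one oscillator current has to adjust the next source in the opposite direction.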

Each spectrum recorded with the spectrum analyser is sent to the computer, where 
it is analysed by a program in real time. The information we use as input to this 
program is: (1) the value of the two frequencies of the external microwave signals 
(f_A, f_B) and (2) the oscillator frequencies at each direct current value in the absence 
of external microwave signals (f_1^0, f_2^0, f_3^0, f_4^0). The output data that we extract from 
each spectrum analysis are the four values of the oscillator frequencies in the presence 
of microwave inputs. Then, another program takes these oscillator frequencies to 
calculate the synchronization states and check whether the applied vowel was prop- 
erly recognized, as follows. If one of the detected frequencies coincides with the 
frequency of one of the external signals (to within 0.5 MHz), we consider that the oscillator 
is synchronized to it. From this analysis, the synchronization pattern that corresponds 
to the input vowel is calculated. This is compared to the synchronization pattern 
initially assigned to that specific vowel to check whether it was successfully classified. 
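
The synchronization test described above can be sketched as a small function. This is an illustration under stated assumptions rather than the program used in the experiment: the function name is hypothetical, the 0.5 MHz figure is interpreted as a symmetric tolerance, and an oscillator that lies within the tolerance of both inputs is attributed to input A.

```python
def synchronization_pattern(osc_freqs_mhz, f_a_mhz, f_b_mhz, tol_mhz=0.5):
    """Label the synchronization configuration, for example '(1A,3B)'.

    osc_freqs_mhz: the four oscillator frequencies measured with inputs applied.
    tol_mhz: assumed matching tolerance between oscillator and input frequency.
    Returns '()' when no oscillator is locked to either input.
    """
    locked = []
    for index, freq in enumerate(osc_freqs_mhz, start=1):
        if abs(freq - f_a_mhz) <= tol_mhz:
            locked.append(f"{index}A")
        elif abs(freq - f_b_mhz) <= tol_mhz:
            locked.append(f"{index}B")
    return "(" + ",".join(locked) + ")"


def is_correctly_classified(observed_pattern, assigned_pattern):
    """A vowel counts as recognized when the observed pattern equals the assigned one."""
    return observed_pattern == assigned_pattern
```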

If we are in the training procedure and the vowel is not properly classified, the 
online learning algorithm calculates how the four direct currents should be modi- 
fied to reduce the recognition error, as described in ‘Real-time learning algorithm’ 
below. This information is then sent back to the experimental set-up, where the 
currents are automatically modified. 

Real-time learning algorithm. In this section, we present the supervised learning pro- 
cedure that was applied to our spin-torque nano-oscillator network to learn to recognize 
different classes of input stimuli. Here these classes correspond to seven different spoken 
English vowels: ae, ah, aw, er, ih, iy and uw (see ref. 28 for details; the sounds can be 
heard at https://homepages.wmich.edu/~hillenbr/voweldata.html). Initially, we assign 
a synchronization pattern to each class of vowel (column 2 in Extended Data Table 1). 

For a perfect recognition of one class of vowel, all data points in the frequency 
input map that corresponds to this vowel (Fig. 1f) must be contained in their 
assigned synchronization pattern in the experimental map (Fig. 1e). If this is not 
the case, for each association spoken vowel-synchronization pattern we define a 
frequency difference vector with four components (one for each oscillator; see 
third column in Extended Data Table 1) that will be used in the learning procedure. 

Starting from a random map configuration (Fig. 1e), the automatic learning rule 
that we developed allows us to converge to a configuration where most data points for 
each vowel class are contained in their respective assigned synchronization pattern. 
The learning rule works in the following way. 

(1) We present to the network a randomly chosen input data point i belonging 
to one vowel class, by sending two microwave inputs with frequencies f_A^i and f_B^i. 

(2) From the resulting spectra, we extract the frequencies of the four spin-torque 
oscillators (f_1, f_2, f_3, f_4) in the presence of the microwave inputs. 

(3) We determine the resulting synchronization configurations by comparing 
the oscillator frequencies to the input frequencies f_A^i and f_B^i. Then, we compare 
the obtained synchronization configuration with the one assigned to this vowel. 

(4) For each vowel presented to the network, we define an associated frequency 
difference vector, which describes the frequency distance between the applied input 
and the assigned synchronization region. For instance, if the presented data point 
belongs to the vowel class ‘ae’, we compute $d_{ae} = [(f_A^i - f_1), 0, (f_B^i - f_3), 0]^T$. 

If one of the two synchronization events assigned to ‘ae’ has occurred, we only 
compute the frequency difference that corresponds to the other event. For instance, 
if oscillator 1 is correctly synchronized to external source A, then we compute 
only $d_{ae} = [0, 0, (f_B^i - f_3), 0]^T$. 

(5) We repeat steps (1) to (4) for all seven vowel classes. 

(6) We compute the sign of the vector sum of all seven associated frequency 
difference vectors, D: $D = \mathrm{sgn}(d_{ae} + d_{ah} + d_{aw} + d_{er} + d_{ih} + d_{iy} + d_{uw}) = (D_1, D_2, D_3, D_4)^T$. 

(7) We then compute the new direct current set $(I_1', I_2', I_3', I_4')^T$, which will be 
applied to the four oscillators: 

$I_k' = I_k + \mu D_k \, \mathrm{sgn}[(\partial f_k / \partial I)_{I = I_k}], \quad k = 1, \dots, 4.$ 

In this equation, μ = 0.1 mA is the learning rate of our algorithm. At each step, 
the applied direct current through each oscillator can be modified only by ±μ. 
Here $\mathrm{sgn}[(\partial f_k / \partial I)_{I = I_k}]$ represents the sign of the frequency evolution versus 
injected direct current of the kth oscillator at the value of current I_k. For this, the 
frequency-current dependence of each independent oscillator has been previously 
characterized. 

Upon modifying the direct currents following this learning procedure, the oscil- 
lator frequencies change. This translates into a displacement of the synchronization 
patterns in the experimental synchronization map (Fig. 2a-d). 

(8) We repeat all previous steps (steps (1) to (7)) N times, where N is the total 
number of training steps. At each iteration, the synchronization map evolves 
towards an optimal configuration where the global frequency difference vector 
d_tot = d_ae + d_ah + d_aw + d_er + d_ih + d_iy + d_uw is minimized. On increasing the 
number of training steps, we observe an increase of the recognition rate until it 
saturates after step 48, reaching a value of 89% (Fig. 2g). In our training experiment, 
we set the maximum number of training steps to N= 87, which corresponds to 
applying three times each of the 29 data points of the training database. 
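
One training step of the learning rule in steps (4) to (7) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and data layout are hypothetical, the frequency difference is taken as input frequency minus oscillator frequency, and an assigned event is treated as already satisfied when the two frequencies agree within an assumed tolerance of 0.5 MHz.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def frequency_difference(assigned, osc_freqs, f_a, f_b, tol=0.5):
    """Frequency difference vector for one presented vowel (step 4).

    assigned: dict mapping a 0-based oscillator index to 'A' or 'B',
              the synchronization events assigned to the vowel class.
    Events that have already occurred contribute zero.
    """
    diff = [0.0] * len(osc_freqs)
    for k, source in assigned.items():
        target = f_a if source == "A" else f_b
        delta = target - osc_freqs[k]
        if abs(delta) > tol:
            diff[k] = delta
    return diff


def update_currents(currents, diff_vectors, slope_signs, mu=0.1):
    """Steps (6) and (7): move each current by at most mu (in mA).

    diff_vectors: one frequency difference vector per vowel class.
    slope_signs:  sign of df_k/dI for each oscillator at its present current,
                  assumed known from a prior characterization.
    """
    summed = [sum(column) for column in zip(*diff_vectors)]
    direction = [sign(s) for s in summed]
    return [i + mu * d * s for i, d, s in zip(currents, direction, slope_signs)]
```

When every vowel in a step is classified correctly, all difference vectors are zero and the currents are left unchanged, which is consistent with the behaviour described in the main text.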
Cross-validation procedure. Training was realized using 80% of the total number 
of vowels in the database. The testing procedure was done using the remaining 
20% data points. The cross-validation technique allows estimating accurately the 
recognition performances of the network by repeating the training/testing proce- 
dure five times over distinct data point samples. Each time, the selected data points 
used for testing are different: in the first (respectively second, third, fourth and 
fifth) cross-validation period, we use the first (respectively second, third, fourth 
and fifth) quintile (20%) of the data points for testing. The final recognition rate 
was obtained by averaging the testing recognition rates of the five cross-validation 
experiments. The same cross-validation procedure is used for all the neural net- 
works (experimental and simulated). 
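
The five-fold procedure can be sketched as below. The function names are hypothetical, and the way fold boundaries are rounded when the number of data points is not a multiple of five is an assumption, since the text does not specify it.

```python
def quintile_folds(n_points, n_folds=5):
    """Yield (train_indices, test_indices) pairs.

    The k-th fold tests on the k-th contiguous block of the data points
    and trains on all remaining points.
    """
    bounds = [round(k * n_points / n_folds) for k in range(n_folds + 1)]
    for k in range(n_folds):
        start, stop = bounds[k], bounds[k + 1]
        test = list(range(start, stop))
        train = [i for i in range(n_points) if i < start or i >= stop]
        yield train, test


def cross_validated_rate(test_rates):
    """Average the testing recognition rates of the individual folds."""
    return sum(test_rates) / len(test_rates)
```

Each data point appears in exactly one test block, so the averaged rate reflects performance on data that was never used to train the corresponding network.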

Comparison of spin-torque nano-oscillators to CMOS oscillators. Extended 
Data Table 2 compares features of CMOS and spin-torque nano-oscillators. ‘Vortex 
spin-torque oscillators’ refer to the magnetic tunnel junctions used in this study; 
‘10 nm spin-torque oscillators’ refer to state-of-the-art magnetic tunnel junctions 
currently used as memory cells. 

Comparison with a multilayer perceptron. To benchmark the results of the 
experimental oscillatory network, we first ran a standard multi-layer perceptron, 
schematized in Fig. 4a, on the same vowel database. 

The network takes as inputs the 12 formants of a given vowel in a database and 
has seven outputs, one for each vowel class. We have varied the number of hidden 
neurons between 1 and 20 to evaluate the recognition rate as a function of the 
number of trained parameters. More precisely, each formant has been rescaled 
between −1 and 1 before being fed into the first layer of neurons. The neuron
activation functions are tanh functions at the hidden layer, and softmax at the
output layer: the outputs $z_i$ ($i = 1$ to 7) are defined as $z_i = e^{y_i} / \sum_{j=1}^{7} e^{y_j}$, where $y_j$ is
the input to the output neuron $j$. The output with the largest $z_i$ is taken as the vowel
class corresponding to the input. We also tried ReLU activation functions, but they 
performed worse than tanh on this task. 
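
As an illustration of the forward pass just described (12 rescaled formants in, tanh hidden layer, softmax over seven vowel classes), a minimal sketch might look like this. The parameter names `W1`, `b1`, `W2`, `b2` are placeholders introduced here, and subtracting the maximum before exponentiating is a standard numerical-stability step that the text does not mention.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Return the softmax outputs z and the index of the predicted vowel class.

    x  : 12 formants, each rescaled to [-1, 1]
    W1 : (n_hidden, 12) hidden-layer weights, b1 : (n_hidden,) biases
    W2 : (7, n_hidden) output-layer weights,  b2 : (7,) biases
    """
    h = np.tanh(W1 @ x + b1)      # hidden layer with tanh activation
    y = W2 @ h + b2               # inputs y_j to the seven output neurons
    e = np.exp(y - y.max())       # shift by max(y) for numerical stability
    z = e / e.sum()               # softmax: z_i = exp(y_i) / sum_j exp(y_j)
    return z, int(np.argmax(z))   # the largest z_i gives the vowel class
```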

For training the network we performed backpropagation, that is, gradient 
descent over the negative log-likelihood (or cross entropy). 

As in the experimental conditions, the samples are picked and presented ran- 
domly to the network. One learning iteration corresponds to one forward pass 
of a given sample through the network, its subsequent gradient evaluation and 
weight update. The learning rate has been tuned to obtain the best result. Weights 
and biases before learning were randomly sampled from a Gaussian of mean 0 
and variance 0.01. 

For each trial, we ran training over 100,000 iterations to ensure convergence 
with a learning rate of 0.05. In practice, optimization techniques such as root- 
mean-square propagation or adaptive moment estimation could be used to accel- 
erate training. All results are reported in Fig. 4b, where we show the recognition 
rate after cross validation as a function of the number of trained parameters. 
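
One learning iteration as described above (forward pass of a single sample, gradient evaluation, weight update) can be sketched as follows. This is an assumption-laden illustration rather than the authors' code: the hidden-layer size of 5 is an arbitrary choice within the 1 to 20 range explored, and the function and dictionary key names are hypothetical.

```python
import numpy as np

def init_params(n_in=12, n_hidden=5, n_out=7, seed=0):
    """Sample weights and biases from a Gaussian of mean 0 and variance 0.01."""
    rng = np.random.default_rng(seed)
    std = np.sqrt(0.01)  # variance 0.01 corresponds to standard deviation 0.1
    return {
        "W1": rng.normal(0.0, std, (n_hidden, n_in)),
        "b1": rng.normal(0.0, std, n_hidden),
        "W2": rng.normal(0.0, std, (n_out, n_hidden)),
        "b2": rng.normal(0.0, std, n_out),
    }

def sgd_step(p, x, label, lr=0.05):
    """One plain gradient-descent update on the negative log-likelihood."""
    # Forward pass
    h = np.tanh(p["W1"] @ x + p["b1"])
    y = p["W2"] @ h + p["b2"]
    e = np.exp(y - y.max())
    z = e / e.sum()
    loss = -np.log(z[label])
    # Backward pass: for softmax + cross entropy, dL/dy = z - one_hot(label)
    dy = z.copy()
    dy[label] -= 1.0
    dh = p["W2"].T @ dy            # computed before W2 is modified
    da = dh * (1.0 - h ** 2)       # derivative of tanh
    # Weight update, with no momentum or learning-rate adaptation
    p["W2"] -= lr * np.outer(dy, h)
    p["b2"] -= lr * dy
    p["W1"] -= lr * np.outer(da, x)
    p["b1"] -= lr * da
    return float(loss)
```

With these sizes the network has 12 × 5 + 5 + 5 × 7 + 7 = 107 trained parameters; varying `n_hidden` changes this count, which is the horizontal axis of the comparison.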

Comparison with RNNs. In addition to the multilayer perceptron (Extended Data
Fig. 2b), we also ran, on the same vowel database, a perceptron (Extended Data
Fig. 2c), as well as a recurrent neural network (RNN; Extended Data Fig. 2d) and
a long short-term memory (LSTM) recurrent neural network (Extended
Data Fig. 2e) with four hidden units. The procedure is similar to that for the multilayer
perceptron. Formants are presented sequentially to the network, which outputs
a vowel once all of them have been swept through. Softmax activation functions
were used at the output layer and tanh elsewhere. Outputs are encoded in a ‘one-hot’
fashion: for example, the ae vowel (out of the seven in total) is encoded by
(1,0,0,0,0,0,0). We take the maximum activation value as the classification result. 
As in the experimental conditions, the samples are picked and presented randomly 
to the network. One learning iteration corresponds to one forward pass of a given 
sample through the network, its subsequent gradient evaluation and weight update. 
For each architecture, the choice of the learning rate has been tuned to obtain the 
best result. Weights and biases before learning were randomly sampled from a
Gaussian of mean 0 and variance 0.01. No gradient inertia or learning rate adaptation
technique was used. For the LSTM and the RNN, we ran training over 500,000
and over 1,000,000 iterations to ensure convergence with a learning rate of 0.01
and 0.0005, respectively. If needed, optimization techniques such as root-mean- 
square propagation or adaptive moment estimation could be used to accelerate 
training. Owing to the mini-batch size, gradient descent is highly stochastic, and 
we average the test and training rates over the last 5,000 iterations to obtain reliable 
training and error rate for a given trial. All results are reported in Extended Data 
Fig. 2a where we show the cross-validation success as a function of the number 
of parameters learnt. 

Synchronization detection through oscillator rectified voltages. In the present 
work, synchronization of the oscillators is detected using a spectrum analyser, 
allowing a comprehensive understanding of the systems and of the physics of the 
oscillators. In a final integrated system, simpler techniques could be used to detect 
synchronization of oscillators. A possibility is given in ref. *°. Another method, 
involving less energy overhead, consists in exploiting the spin diode effect*!, 
which causes synchronized oscillators to generate a supplementary direct volt- 
age**. Extended Data Fig. 3a and b illustrates this effect in one of our oscillators. 
The appearance of a rectified voltage measured between the oscillator electrodes 
(Extended Data Fig. 3a) coincides with the locking range (Extended Data Fig. 3b). 
The generated rectified voltage is proportional to the fraction of the external microwave
current $I_{\mathrm{ext}}$ flowing through the oscillator*”. In our experiments, $I_{\mathrm{ext}}$ is small:
the input microwave signals are sent through a strip line isolated from the oscillators,
in a geometry minimizing by design the capacitive coupling between oscillator
and strip line ($I_{\mathrm{ext}} = 7.5 \times 10^{-4}\, I_{\mathrm{stripline}}$). As a result, the measured rectified voltages
are small (approximately 0.5 mV). In the future, these values can be increased up 
to several tens of millivolts by optimizing the coupling between oscillator and strip 
line. Indeed, as demonstrated experimentally, rectification effects due to oscillator 
phase locking can be large, with sensitivities reaching 75.4 mV for the generated 
d.c. voltage per microwatt of injected microwave power”. 

We now present how synchronization detection through the resulting rectified 
voltages may be implemented in a final integrated circuit, using a differential 
method. We propose to use four reference resistors with the same resistance as 
the mean resistance of the nano-oscillators and polarized in the same manner. 
Comparing the voltage across a nano-oscillator and the corresponding reference 
resistance then allows detection of whether the oscillator is experiencing syn- 
chronization (Extended Data Fig. 3c). We designed a simple two-stage CMOS 
circuit to perform this comparison (Extended Data Fig. 3d,e). The first stage 
is composed of two differential amplifiers (voltage to current) in parallel. It is 
followed by a gain stage (current to voltage amplifier). The mismatch between 
the two amplifiers, a standard design technique, allows high gain. The output 
of the circuit is therefore a binary voltage, high if the oscillator is synchronized 
to the input signal, low otherwise. This voltage can be used directly by a standard
CMOS digital circuit to obtain the class of the input. In the circuit, the bias voltages
($V_{\mathrm{bias1}}$ and $V_{\mathrm{bias2}}$) can be adjusted to vary the speed and power consumption of
the circuit. 

We simulated this circuit in transient operation using the Cadence Spectre 
SPICE simulator, a standard tool in commercial integrated circuit design, with 
the design kit of a 28-nanometre commercial CMOS technology, and optimized the 
bias voltages for minimal energy consumption, while retaining a response time of 
the circuit below 600 ns. Extended Data Fig. 3f shows the energy consumed by the 
detection circuit as a function of the rectified direct voltage due to synchronization, 
taking into account the whole transient of the detection. This energy can be low: it 
is below 200 fJ for rectified direct voltages above 50 mV, which can be achieved in 
structures optimized for spin diode effect*”. For a full system, this detection must 
be performed twice (we send two input signals), for the four oscillators, leading to 
a detection energy of 2 × 4 × 200 fJ = 1.6 pJ.

Using our current oscillators, this energy would be smaller than the energy 
dissipated by the oscillators and the reference resistors. By contrast, with scaled 
nano-oscillators (see Extended Data Table 2), this 1.6 pJ detection energy would 
become dominant. 

It is interesting to compare this quantity with the energy consumption of a 
purely CMOS neural network, implementing the multilayer perceptron of Fig. 4a. 
Optimized CMOS neural networks compute in reduced precision, usually 8-bit 
integers, which allows low energy consumption*’. Taking into account the arith- 
metic operations (sum and multiplications), in the same commercial 28-nanometre 
technology as the detection circuit that we implemented, we calculated that an 8-bit 
integer neural network implementing the second layer of the neural network of 
Fig. 4a consumes 2.2 pJ. We only took into account the second layer of the neu- 
ral network, as it is the part implemented by the nano-oscillators. To obtain the 
energy estimation, we synthesized a Verilog description of a multiply-and-accumulate
block and computed its energy consumption with the Cadence Encounter
tools using appropriate value change dump files generated by the Cadence ncsim
simulator. 
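
The arithmetic behind the two energy figures quoted above can be written out explicitly. The short sketch below only reproduces numbers stated in the text (200 fJ per detection event, two input signals, four oscillators, 2.2 pJ for the CMOS layer); the function and variable names are illustrative, and the energy dissipated by the oscillators and reference resistors themselves is not included.

```python
FJ = 1e-15  # femtojoule in joules
PJ = 1e-12  # picojoule in joules

def detection_energy(n_inputs=2, n_oscillators=4, energy_per_event=200 * FJ):
    """Total synchronization-detection energy for one classification."""
    return n_inputs * n_oscillators * energy_per_event

e_det = detection_energy()   # 2 x 4 x 200 fJ
e_cmos = 2.2 * PJ            # 8-bit CMOS second layer, as quoted in the text

print(f"detection: {e_det / PJ:.1f} pJ, CMOS second layer: {e_cmos / PJ:.1f} pJ")
```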

These energy considerations show that on our tiny control system, a nano-oscillator-based
solution would provide an energy consumption slightly smaller
than an optimized CMOS-based solution. We expect that the full benefit of 
the oscillator system will appear in deep networks composed of many layers of 
spin-torque nano-oscillators. Indeed, cascading the synchronization states from 
one layer to the next can be achieved directly through oscillatory interlayer 
coupling and does not require synchronization detection. Only at the last layer 
will detection circuits be required to communicate their state to other circuits. 
Therefore, we expect that in a deep network of oscillators, the energy consumption 
will be largely dominated by the oscillator energy consumption, which can be low 
for a scaled-down oscillator, as can be seen from Extended Data Table 2. 


Data availability 
The datasets generated and analysed during this study are available from the cor- 
responding authors on reasonable request. 

Extended Data Fig. 1 | Schematic of the experimental set-up. The four coupled vortex nano-oscillators are shown. $I_{\mathrm{RFA}}$ and $I_{\mathrm{RFB}}$ are the microwave
currents injected in the strip line by the two microwave sources. $H_{\mathrm{RF}}$ is the resulting microwave field. $I_{\mathrm{DC1-4}}$ are the applied direct currents.

Extended Data Fig. 2 | Recognition rates obtained by different neural trained parameters. b-e, Schematics of the simulated neural networks:

oscillator and its frequency is swept. Here, the direct current through the oscillator is 5 mA, the magnetic field is 585 mT and the injected microwave power is +1 dBm. b, Oscillator spectrum emission measured during the same frequency sweep as a. c, Proposed differential measurement
VDD, supply voltage; GND, ground. f, Energy consumption of the CMOS circuit for one synchronization detection event, as a function of the amplitude of the generated rectified direct voltages.

Extended Data Table 1 | Learning rule
Column 1, spoken vowel class; column 2, synchronization pattern assigned to each vowel; column 3, frequency difference vector between the spoken vowels and their associated patterns.
The index i refers to the ith data point of a vowel class (ith speaker).

Extended Data Table 2 | Comparison of CMOS and spin-torque nano-oscillators for neuromorphic computing


                                          Lateral dimensions  Energy/oscillation  Frequency  Power consumption  Ability to synchronize  References
CMOS neuron                                                                       10 Hz      2.65 nW            Yes
Scaled CMOS neuron                        ≈7 µm                                   30 Hz      1.5 nW             Yes
Accelerated CMOS neuron                                                           1 MHz                         Yes
CMOS ring oscillator                                                              200 kHz    1.2 nW             Unknown
CMOS ring oscillator                                                              1.5 GHz    50 ul              Unknown
CMOS ring oscillator                      ≈300 µm                                 16 GHz     23 mW              Yes
Vortex spin-torque oscillator             300 nm              3 pJ                300 MHz
10 nm spin-torque oscillator (projection) 10 nm               100 aJ              10 GHz

Data from refs 24, 34–39.

Palladium-mediated enzyme activation suggests
multiphase initiation of glycogenesis 


Matthew K. Bilyard, Henry J. Bailey, Lluis Raich, Maria A. Gafitescu, Takuya Machida, Javier Iglésias-Fernandez,
Seung Seo Lee, Christopher D. Spicer, Carme Rovira, Wyatt W. Yue & Benjamin G. Davis


Biosynthesis of glycogen, the essential glucose (and hence energy) 
storage molecule in humans, animals and fungi’, is initiated by 
the glycosyltransferase enzyme, glycogenin (GYG). Deficiencies 
in glycogen formation cause neurodegenerative and metabolic 
disease*~4, and mouse knockout? and inherited human mutations® 
of GYG impair glycogen synthesis. GYG acts as a ‘seed core’ for the 
formation of the glycogen particle by catalysing its own stepwise 
autoglucosylation to form a covalently bound gluco-oligosaccharide 
chain at initiation site Tyr 195. Precise mechanistic studies have so 
far been prevented by an inability to access homogeneous glycoforms 
of this protein, which unusually acts as both catalyst and substrate. 
Here we show that unprecedented direct access to different, 
homogeneously glucosylated states of GYG can be accomplished 
through a palladium-mediated enzyme activation ‘shunt’ process 
using on-protein C-C bond formation. Careful mimicry of GYG 
intermediates recapitulates catalytic activity at distinct stages, which 
in turn allows discovery of triphasic kinetics and substrate plasticity 
in GYG’s use of sugar substrates. This reveals a tolerant but ‘proof- 
read’ mechanism that underlies the precision of this metabolic 
process. The present demonstration of direct, chemically controlled 
access to intermediate states of active enzymes suggests that such 
ligation-dependent activation could be a powerful tool in the study 
of mechanism. 

The initial anchor point for the dendron-like structures that make 
up glycogen is the Tyr195 residue of GYG (using GYG1 numbering); 
glycogenesis is therefore a striking example of α-linked protein autoglucosylation’.
Prior studies have suggested GYG to be a dimeric’,
Mn²⁺-dependent enzyme belonging to the GT-8 family of retaining
glycosyltransferases”!°. GYG is, by virtue of its self-modifying
nature, non-identical for each glucosylation step; that is, GYG, unlike
nearly all biosynthetic enzymes, is strictly not a catalyst since it is itself 
changed at each step (Extended Data Fig. 1). This leads potentially to 
altered activity for each intermediate state and presumably to eventual 
inactivity once ‘buried’ in the polymer of glucose units that will emerge 
to make up the extended glycan portion of glycogen. This opens up the 
unusual possibility of distinct subphases and mechanisms occurring at 
different oligosaccharide chain lengths; crystal structures suggest possi- 
ble intra-monomeric and inter-monomeric glucosylation modes within 
the GYG protein dimer!®. Although bespoke biosynthetically deficient 
expression host strains can generate’! a glycan-free, starting form of 
GYG, this allows access to only one catalyst state (Supplementary 
Text). As a result, any possible ‘(sub)phases’ subsequent to this starting 
state may be obscured if they follow faster kinetics. A lack of access 
to homogeneous GYG catalyst states therefore restricts our current 
understanding. 

We reasoned that chemical construction of pure GYG in different 
glucosylation states might allow a strategy for direct, guided (‘shunted’) 
activation (and hence interrogation) of chosen intermediate states 
(Fig. 1, Extended Data Figs. 1, 2a). The unusual hybrid nature of these 


catalyst states—part-catalyst, part-substrate—suggested a convergent 
(tag-and-modify’”) construction process in which the desired (glycosyl 
acceptor) glycan moiety would be covalently attached in one step to key 
catalytic site 195 (Extended Data Fig. 1c). We have previously demon- 
strated that Pd(0)-mediated C-C-bond-forming ligation is feasible and 
benign in certain biological contexts'*"!”. Pd-mediated approaches in 
biology have since been elegantly exploited by various groups'*”°. 
However, GYG is a testing target biomolecule on which to apply this 
method. Not only is site 195 in the heart of the active site, but GYG is 
also metal-dependent, raising the possibility of inhibitory ‘poisoning’ 
cross-competition”!” by Pd at the metal co-factor site. 

A suitable precursor GYG1 bearing a reactive ‘tag’ for Pd(0)-mediated
C-C bond formation was generated via site-specific unnatural amino 
acid incorporation'*”?*, giving a variant in which the p-hydroxy 
group of the natural, wild-type (WT) tyrosine residue at site 195 was 
exchanged for an iodide moiety (OH → I, GYG-Tyr195 → GYG-pIPhe195,
Fig. 1). Characterization confirmed no deleterious effects
on overall enzyme structure. The structure of GYG-Y195X, determined
in both apo (2.2 Å) and Mn²⁺ + UDP bound (2.4 Å) states
(Supplementary Table 1; UDP, uridine diphosphate), revealed dimers 
that were highly superimposable on those in GYG-WT"® (Extended 
Data Fig. 2c). In the ligand-bound state, the pIPhe195 group from 
one monomer is clearly visible (Extended Data Fig. 2d, inset), located 
within a partially unwound helix that adopts a catalytically poised 
position equidistant from either active site of the dimer (Extended 
Data Fig. 2d, red). Asymmetry at the dimer interface, consistent with 
previous unglucosylated GYG-WT structures", suggested likely con- 
formational flexibility needed as GYG transitions from unconjugated 
to differently glucosylated forms. 

Studies on wild-type GYG (GYG-WT/GYG-Tyr195) revealed con- 
centration-dependence of Pd inhibition and hence determination of 
essentially benign Pd concentrations that would successfully allow pres- 
ervation of enzymatic activity (Extended Data Fig. 3, Supplementary 
Note); other cross-coupling components had minimal effect. These 
conditions allowed successful Pd-mediated C(sp²)–C(sp²) ligation of
GYG-pIPhe195 to a variety of designed, systematically altered ‘substrate
templates’ (Fig. 1, Extended Data Figs. 2a, 4); all bore nucleophilic
hydroxyl groups as possible reaction sites for autoglucosylation (readily
prepared as their corresponding C(sp²) boronic acid derivatives 1;
see Supplementary Methods). Small amounts of side-products were
also identified (Extended Data Fig. 5): for example, unreacted GYG-pIPhe195
or species attributable to dehalogenation”” using liquid chromatography–mass
spectrometry (LC-MS) analysis and negative control
studies (Supplementary Methods and Supplementary Text). Despite 
successful Pd-mediated ligation, ‘simple’ glycan-mimic templates 
(Extended Data Figs. 2a, 4) provided ineffective mimicry: irrespective 
of their systematically varied properties (orientation, length or pKa),
none led to activation of autoglucosylation. Activation of GYG requires 
therefore more than just an available hydroxyl nucleophile positioned 

Fig. 1 | Palladium-mediated C(sp²)–C(sp²) ligation as a strategy for
mechanistic investigation of glycogenin. First transformation, amber
codon suppression enables ‘OH → I’ replacement of the native Tyr195
acceptor of GYG-WT with an unnatural L-p-iodophenylalanine residue. 
This GYG-pIPhe195 enzyme, which lacks a native glycosyl acceptor and 
thus cannot undergo glucosylation, represents a suitable substrate for Pd- 


in the active site. However, for more complex substrate templates dis- 
playing glucosyl moieties inside GYG, not only did protein LC-MS 
analysis reveal successful C-C ligation but also concomitant activation
(‘switch on’) and clear autoglucosylation activity in the resulting
‘shunted’ intermediate product GYG-Glc as well as the more advanced,
chain-extended ‘shunted’ states GYG-Glc-Glc and GYG-Glc6 (Fig. 2,
Supplementary Information). Side-products from cross-coupling 
(Extended Data Fig. 5) were inactive to autoglucosylation and thus 
did not interfere in the assay. 

This chemically generated access to ‘shunted’ functionally active 
intermediate states of GYG along the glycogen biosynthetic path- 
way allowed us to uniquely probe and compare activity using LC-MS 
monitoring of the sugars attached over time (Fig. 2c, d, Extended 
Data Figs. 6, 7, Supplementary Methods). Immediately contrasting 
behaviours from different states were observed. For more extended 
GYG-Glc-Glc, two distinct glucosylation phases were apparent: rapid
glucosylation from 2 until about 4-5 Glc total, then substantially
slower catalysis thereafter (Fig. 2c, d). Indeed, the initial step
(GYG-Glc-Glc → GYG-Glc-Glc-Glc) was extremely rapid; on-protein kinetic
analyses conducted in replicate (see Supplementary Methods) revealed
that about 90% of starting GYG-Glc-Glc was consumed within 20 s.
In striking contrast, GYG-Glc exhibited a more gradual decline in 
glucosylation rate with increasing oligosaccharide length (Fig. 2c, d), 
consistent with a substantially slower initiation subphase for GYG-Glc 
(GYG-Glc → GYG-Glc-Glc) that thus obscures the rapid phase immediately
following (Extended Data Fig. 6d). Taken together, these data
suggested a triphasic mechanism, in which a rapid intermediate phase 
is flanked by slower initiation (<2 glucoses) and elongation (>4/5 glu- 
coses) phases (Extended Data Fig. 6d). Notably, only through the direct 
‘shunt’ formation of intermediates (GYG-Glc, GYG-Glc-Glc, and so 
on) achieved through Pd-mediated ligation was unobscured analysis 
of each subphase made possible (Extended Data Fig. 7). Clear visual- 
ization of this kinetic profile was a consequence of our ability to both 
circumvent initial slow Tyr195 glucosylation and also probe discrete 
glucosylation states immediately after this. The presence of distinct 
(sub)phases is consistent with the proposed'®>”° existence of different 
glucosylation mechanisms for GYG. 

Use of ‘shunted’ intermediates GYG-Glc, GYG-Glc-Glc and GYG-Glc6
allowed the determination of initial rates that gave apparent rate
constants for each associated phase of $k_{\mathrm{app}}$ = 0.016, 0.126 and 0.003 s⁻¹,
respectively (Extended Data Fig. 7b). These were also compared 
directly with kinetics determined from analysis of wild-type GYG 
in unglucosylated form (GYG-WT-Glc0, Extended Data Fig. 7). As 
expected, the inability to access intermediate states for GYG-WT failed 
to reveal the distinct phases shown by our chemically ‘shunted’ system. 
Nonetheless, global values for turnover proved consistent; we now show 
that one consequence of the triphasic regime is an accumulation of 
glucosylation at the end of the fast phase 2 mechanism regime (lengths 
5-6 Glc) going into the slower phase 3. Taken together, this confirmed 
mediated Suzuki-Miyaura cross-coupling (second transformation) to a
range of boronic acid sugar mimic templates, to generate potentially active 
enzyme species that mimic defined GYG glycoforms. In this way, inactive 
GYG-pIPhe195 might be activated through C-C bond-forming ligation 
allowing pre-determined, ‘shunted’ access to intermediate catalyst states of 
GYG. See Extended Data Figs. 2, 4 for templates. 


quantitative mimicry at similar activity levels and highlighted the need 
for the chemical ‘shunted’ approach in revealing detailed mechanism. 
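
As a rough consistency check on the rate constants quoted above, the fast-phase value can be compared with the observation that about 90% of GYG-Glc-Glc was consumed within 20 s. The sketch below assumes simple first-order (single-exponential) consumption of the starting enzyme, which is an assumption made here for illustration and not a model stated in the text; the phase labels are likewise illustrative.

```python
import math

# Apparent rate constants (s^-1) quoted in the text for the three phases
K_APP = {"phase 1": 0.016, "phase 2": 0.126, "phase 3": 0.003}

def fraction_consumed(k_app, t):
    """Fraction of starting enzyme consumed after t seconds, first-order model."""
    return 1.0 - math.exp(-k_app * t)

def half_life(k_app):
    """Time for half of the starting enzyme to be consumed, first-order model."""
    return math.log(2) / k_app

for phase, k in K_APP.items():
    print(f"{phase}: consumed after 20 s = {fraction_consumed(k, 20):.2f}, "
          f"half-life = {half_life(k):.1f} s")
```

Under this assumption the phase 2 constant gives roughly 92% consumption at 20 s, in line with the reported value of about 90%.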

Quantum mechanics/molecular mechanics (QM/MM) metadynam- 
ics”””8 simulations (see Supplementary Methods) allowed further insight 
through detailed reconstruction of the free-energy surface of reaction as 
a function of a few selected degrees of freedom (collective variables, CVs; 
Supplementary Methods). Michaelis complexes equivalent to GYG-Glc-Glc
→ GYG-Glc-Glc-Glc (both in WT, GYG-WT-Glc3 → GYG-WT-Glc4,
and ‘shunted’, GYG-Glc-Glc → GYG-Glc-Glc-Glc, form) were
reconstructed from the structures determined here and from those in complex
with UDP-Glc and cellotetraose!®. Both wild-type and ‘shunted’
forms gave similar results (Extended Data Fig. 8), consistent with kinetic 
parameters. The free-energy surface revealed a short-lived intermediate 
(Extended Data Fig. 9) along the minimum free-energy pathway indicative
of a front-face, ‘SNi-like’ reaction mechanism (see Supplementary
Video)”?**. The free-energy barrier of approximately 10 kcal mol⁻¹
was very low compared with typical values obtained previously*° for
similar ‘SNi-like’ glucosyl transfer reactions (about 20 kcal mol⁻¹), suggesting
potentially different rate-limiting processes. Thus, together our
kinetic experimental and QM/MM data reveal unprecedentedly fast 
glycosyl-transfer for the second subphase of glycogen formation. The 
Michaelis complex (R’ in Extended Data Fig. 9b) exhibits a near-perfect
approach between the O4′–H acceptor bond and the C1–Op donor bond
to assist the departure of UDP. The resulting very short C1···O4′ and
H···Op distances (3.3 and 2.0 Å, respectively, compared with 3.2 and
2.5 Å in prior, representative systems”) for formed bonds provide excellent
stabilization of charge developed at the phosphate, together with
proper orientation for forthcoming front-face nucleophilic attack of O4′
onto C1 of Glc. The acceptor O–H in GYG thus creates a direct hydrogen
bond H···Op, unlike prior systems, resulting in a more stretched
sugar–phosphate bond (C1–Op) in GYG (1.58 Å compared to 1.51 Å;
ref. °°) with a much lower associated bond energy (about 10 kcal mol⁻¹
compared to approximately 18 kcal mol⁻¹).

To probe the selectivities of this multiphasic GYG mechanism, we 
next investigated the potential of GYG to use non-glucose sugar sub- 
strates*’. The potential for GYG to use non-glucose acceptor sugar 
moieties has not been examined owing to the inability, until now, to 
directly access requisite intermediate enzyme states and to insert into 
those states non-glucose sugars. GYG-Glc and GYG-Glc-Glc generated
by Pd-mediated ligation were capable of using the non-glucose donor
sugar UDP-galactose with kinetic profiles qualitatively similar
to analogous autoglucosylation reactions (Fig. 3a), thereby forming
GYG-Glc-(Gal)n and GYG-Glc-Glc-(Gal)n. Notably, however, the third
kinetic (sub)phase observed for autoglucosylation was curtailed for 
autogalactosylation (Extended Data Fig. 6). Shunted access to GYG 
bearing common non-glucose but naturally occurring mammalian 
monosaccharides D-galactose (GYG-Gal) and D-mannose (GYG-Man)
(Fig. 3b, Supplementary Information, using boronic acid reagents
1-Gal, 1-Man) revealed that both were capable of autoglucosylation to 

Fig. 2 | Generation of homogeneously glucosylated, catalytically active
GYG glycoforms and kinetic studies of GYG-Glc and GYG-Glc-Glc.
a, Pd-mediated C-C bond-forming ligation of glucose-derived boronic
acid 1-Glc to GYG-Y195X generates in good yield the homogeneous 
glycoform GYG-Glc, which exhibits catalytic activity, as shown by LC-MS 
analysis. Similar results were obtained for at least four independent 
repeats. In all cases, non-glucosylated side-products present show no 
activity in the assay. b, Cross-coupling to 1-Glc-Glc instead enables
direct, ‘shunted’ access to a further catalytic intermediate of GYG-Glc, 
GYG-Glc-Glc, which also proved catalytically active, as also shown by 
LC-MS. Similar results were obtained for at least five independent repeats. 
In all cases, non-glucosylated side-products present show no activity in 
the assay. c, d, Kinetic profiles of overall glucosylation (c) and the initial 
glucosylation step as monitored through consumption of starting enzyme 
(d and inset) for GYG-Glc and GYG-Glc-Glc. The glucosylation levels 
and abundance of the starting enzyme were determined through LC-MS 
analysis (see also Extended Data Fig. 6 and Supplementary Methods). 
Whereas GYG-Glc-Glc exhibits a marked ‘fast → slow’ biphasic profile,
these same phases, while necessarily present for GYG-Glc, are not visible,
being instead obscured by a slower initiation step. For both c and d, data
points represent mean averages of n independent replicate kinetic runs;
n = 4 (GYG-Glc) and n = 5 (GYG-Glc-Glc). Error bars are ±1 s.d.


form both GYG-Gal-(Glc)n and so on and GYG-Man-(Glc)n and so
on (Fig. 3c). Kinetic analyses of this non-glucose acceptor activity of 
GYG revealed glucosylation rates for GYG-Gal and GYG-Man that 
are initially lower as a consequence of a substantially slower initiation 
step/(sub)phase. In contrast to their plasticity towards glucosylation, 
the non-glucose enzyme states GYG-Gal and GYG-Man did not cat- 
alyse autogalactosylation to any substantial extent (Supplementary 
Table 12). Molecular dynamics (MD) simulations (Extended Data 
Fig. 9) suggested that the altered configurations of non-glucose sug- 
ars—for example, Gal in GYG-WT-Glc-Gal or UDP-Gal—necessitated 

Fig. 3 | Donor and acceptor plasticity of GYG. a, GYG-Glc (top row)
and GYG-Glc-Glc (middle row) are capable of using the unnatural donor
UDP-galactose, as shown by LC-MS. Bottom row, kinetic profiles for
overall galactosylation (left) and rate of initial galactosylation step (right).
Data points represent mean averages of n = 3 independent replicate kinetic
runs for both GYG-Glc and GYG-Glc-Glc galactosylation. Error bars are
±1 s.d. b, Generation of non-natural GYG glycosyl acceptors GYG-Gal
and GYG-Man (bottom and top respectively), as shown by LC-MS.
Similar results were obtained for at least three independent repeats.
c, Autoglucosylation activity of GYG-Man (top row) and GYG-Gal
(middle row) compared to GYG-Glc, as shown by LC-MS. Bottom row,
kinetic profiles analogous to those in a, overall glucosylation (left) and
rate of initial glucosylation step (right) are shown. Data points represent
mean averages of n independent replicate kinetic runs; n = 4 (GYG-Glc)
and n = 3 (GYG-Gal, GYG-Man). Error bars are ±1 s.d. In all cases, non-glycosylated
side-products present show no activity in the assay.


slight reorientations but could be accommodated without substantially 
altering the interactions at the active site with key hydroxyl-binding 
residues. The result is that the distance of the putative nucleophile 

Fig. 4 | Natural and unnatural pathways of GYG catalysis further
delineate a triphasic mechanism and reveal a possible proof-reading 
step. a, Motion of Tyr195 to accommodate acceptor substrates of various 
lengths and conformations (intra- and inter-monomeric). Results were 
obtained from MD simulations for each Michaelis complex. Acceptor 
sugar units have been omitted for clarity. The orange loop corresponds to 
the acceptor arm of the same subunit of the displayed active site (that is, 
intra), whereas the white loop is the acceptor arm of the opposite subunit 
(inter). The tyrosine residue represented as transparent indicates an 
unstable conformation due to steric hindrance with the ‘blocking loop’ 
coloured in blue. Notice that the tyrosine residue recoils one position
for each sugar that is attached to it. Hydrogen atoms and acceptor
glucose units have been omitted for clarity. b, Comparison of the natural
autoglucosylation pathway (and the unnatural autogalactosylation 
pathway for various GYG substrates) reveals that, while the slower first 
and third phases (which we speculate operate through inter-monomer 
multimer modes; see right hand column schematic) display limited Gal- 
Gal transfer, this reaction readily proceeds throughout the fast second 
phase (which we speculate is intra-monomer). The consequent absence 
of a third phase for autogalactosylation may function as a ‘refining’ step, 
preventing incorporation of poly-Gal oligosaccharides into glycogen and 
thus preventing accumulation of misformed, potentially toxic, glycogen 
particles. Phase characteristics are summarized on the left. 
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(OH-4) from the electrophilic anomeric carbon (in UDP-Gal or UDP- 
Glc, respectively) is not greatly perturbed (O---C1 change <0.5 A) and 
O-C bond formation can thus evolve essentially ‘normally’ despite such 
changes, reflecting this experimentally observed plasticity. 

Taken together, distinct mechanistic phases of GYG (Fig. 4) 
are therefore defined not only by different rates but also different 
donor/acceptor tolerance. Whereas the second, rapid phase (2 to 4/5 
sugars) readily tolerates Gal-to-Gal transfer throughout (species with 
up to 5 sugars are quickly generated from GYG-Glc-Glc), the first and 
third phases show similarities in being linked by not only their slower 
glucosylation rates but also their apparent lower tolerance of non- 
glucose in both acceptor and donor at the same time. A plastic and 
rapid second phase is thus seemingly preceded by a slower step that 
can nonetheless be primed with unnatural sugars—immediately 
surprising given the presumed specific role of glycogen as a glucose- 
storage polymer—and is followed by a slower and much more selective 
third phase. Together these three phases appear to allow ‘priming’ 
with non-glucose sugars in the first phase (for example, Gal, Man) 
followed by more rapid and more plastic ‘extension’ in the second
phase (with either UDP-Glc or UDP-Gal) before a third ‘refining’
phase that ensures use of only glucose in the more extended portions 
of the inner core of glycogen. 

From data gathered here and earlier!®, we speculate that these 
phases may reflect, in part, transitions between intra-monomeric and 
inter-monomeric modes of glucosylation within the active GYG protein 
dimer. From our structure of GYG-pIPhe195, we see that the anchor 
point for the oligosaccharide chain of glycogen is essentially equidistant 
from the two active sites in GYG dimer. MD simulations (Extended 
Data Fig. 8) with GYG bearing Glc-oligomer chains of different lengths 
(GYG-WT-Glcn, n = 0–5) and conformations (intra-/inter-monomeric)
suggest that the first glucosylation steps (n = 0, 1) are preferentially
inter-monomeric. A ‘blocking loop’ in between the acceptor arm and
the active site of the same subunit hampers intra-monomeric conformations.
In contrast, sugar chains of subsequent steps (n = 2, 3) circumvent
the blocking loop, allowing intra-monomeric conformations. A key
positioning residue, GYG-Asp125, binds the nucleophilic acceptor Glc
terminus, allowing equilibration into a productive Michaelis complex, and
guides the OH-4 to the donor site from the α face of UDP-Glc, ready
for the front-face attack (optimal for intermediate levels of GYG
glucosylation). Key to this process is a striking flexibility of GYG-Tyr195,
which steadily recoils step-by-step by the distance of one sugar ring to
accommodate acceptor Glcn chains of increasing lengths (Fig. 4).

Together these data suggest a first inter-monomer phase in which the 
nascent oligosaccharide chain is of insufficient length to easily provide 
the right orientation to be processed by the active site but can eventually 
equilibrate (‘hooked’ into place by Asp125) to a productive Michaelis
complex owing to flexibility of Tyr195. In the second phase, sufficient
flexibility of the oligosaccharide chain allows correct orientation and a
rapid intra-monomer extension, yet with low selectivity. Finally, in the 
third ‘refining’ phase, extension of the nascent oligosaccharide chain 
past the active site of its own protein monomer requires extension 
by the active site of another monomer in a much more closely linked 
dimer, which requires careful alignment of donor substrate (UDP-Glc 
only) recruitment with binding of the extending chain. Eventually, this 
chain too processes past the point of the second active site and GYG’s 
activity ceases at a longer chain length of more than 12 Glc units. This 
presents a Glc-terminated core-glycan particle ready for elaboration 
by glycogen synthase (GYS) and glycogen branching enzyme (GBE), 
respectively (Extended Data Fig. 1)”. 

The plasticity of GYG raises the question of whether non- 
glucose sugars can ever be incorporated into mature glycogen particles. 
Whereas natural incorporation of mannose from its most abundant 
nucleotide GDP-mannose is not feasible owing to the known specificity 
of GYG for pyrimidine nucleotide sugar donors*”, UDP-Gal is readily 
available in vivo. In this light, the limited final kinetic phase for
autogalactosylation is consistent with a ‘refining’ mechanism that prevents
misformed glycogen particles due to, for example, poly-Gal incorporation
(Fig. 4). At the same time, GYG’s ability to utilize UDP-Gal in earlier
phases may facilitate early glycogenesis during times in which UDP-Glc 
supplies are scarce. This suggests that the core of glycogen can carry 
priming glycans that may be non-glucose in nature. Our work here also 
highlights that while non-glucose sugars might serve this role, other sim- 
pler, hydroxyl-only templates fail. This, in turn, suggests that this core 
region does not serve as an energy storage polymer (since it would release 
incorrect sugars for metabolism) but instead acts to anchor glycogen 
to the glycogenin core protein. Together, these three phases—prime- 
extend-refine—therefore appear to represent a mechanistic solution to 
the delicate evolutionary balance between the difficult-to-achieve need 
to anchor glucose-polymer to a protein with the need to ensure precise 
glucose-only particle formation at its outer regions. 

The chemical ligation approach used here has shown that, whereas 
natural C-O Tyr195-to-glucose linkages cannot be accessed via any 
current chemical modification approach (Extended Data Fig. 10), 
Pd-mediated formation of an irreversible C-C bond can yield 
sufficiently similar motifs to allow functional mimicry of GYG in 
glycogenesis. They reveal that GYG’s catalytic activity does indeed 
vary through these intermediate states and highlight how this
‘self-modulation’ seems to be exploited by nature in three phases with
different functions. We anticipate that this methodology may ultimately
be expanded to access a wider range of precise glycogen structures, 
enabling study of other glucosylation and associated processing steps 
that will shine further light on the expanding number of glycogen-
associated diseases.

More broadly, the demonstration of successful mimicry that we have 
achieved here by using chemistry to covalently and directly ‘bolt in’ a 
key residue alteration to create an intermediate catalytic state highlights 
that new protein chemistries are becoming accurate and subtle enough 
to allow precise (for example, ‘shunt’) mechanistic experiments that 
would be difficult through classical biochemical means. Although strat- 
egies for chemical rescue of enzymes via unmasking of caged natural 
residues have been elegantly explored’?***», to our knowledge these 
experiments mark a rare application of Pd-mediated C-C-bond-forming
ligation as a mode of chemical enzyme activation. It suggests that such 
ligation-dependent activation (here using catalytic metal Pd(0) as a 
‘switch’) could be a powerful tool not only in the study of mechanism 
but even potentially in the future ‘rescue’ of deficient enzymes. 


Reporting summary 
Further information on experimental design is available in the Nature Research 
Reporting Summary linked to this paper.

Extended Data Fig. 2 | Development of chemically addressable GYG
scaffold as a strategy for mechanistic investigation. a, GYG-pIPhe195 
enzyme (top, middle), which lacks a native glycosyl acceptor and thus 
cannot undergo glucosylation, represents a suitable substrate for Suzuki- 
Miyaura cross-coupling to a range of boronic acids (boxed), to generate 
potentially active enzyme species that mimic defined GYG glycoforms. 
In this way, inactive GYG-pIPhe195 might be activated through C-C 
bond-forming ligation allowing pre-determined, ‘shunted’ access to 
intermediate catalyst states of GYG. b, Expression of GYG-195pIPhe in 
Escherichia coli using a polyhistidine tag removable by TEV cleavage was 
confirmed by LC-MS (shown; y axis, m/z ratio) and showed structural 
similarity to wild-type enzyme. Similar LC-MS spectra were obtained 
for at least three analogous expressions. Circular dichroism plots (one 
shown, with ellipticity versus wavelength) are mean averages of three 
successive measurements of the same sample (n = 1). c, d, Overlay (c) 
and (d) enlarged view of the chain A acceptor arm for the structures of 
GYG-Y195X (Mn²⁺ + UDP; this study) (red), GYG-Phe (Mn²⁺ + UDPG)
(yellow), GYG-WT (Glc, + UDP) (blue), and GYG-WT (Glcs + UDP)
(green). Inset to d, 2Fo − Fc electron density map for Y195pIPhe of the
acceptor arm. The acceptor arm in chain B is disordered (marked by 
red-dashed line in c) and probably adopts multiple conformations to 
accommodate the equivalent pIPhe group. e, Evidence of dimer formation 
for GYG in solution. Left, SEC-SAXS signal plot. Each orange point 
represents the integrated area of the ratio of the sample SAXS curve to 

the estimated background. Each blue point shows radius of gyration (Rg) 
estimated from the Guinier region for each frame. Right, main panel, 
logarithmic intensity plot of subtracted and merged SAXS frames. Cyan 
circles represent averaged buffer frames subtracted from averaged sampled 
frames. Black circles represent the median of the buffer frames subtracted 
from the averaged sample frames. Above, aligned, averaged and refined 
DAMMIN ab initio model (grey) superimposed with the dimeric GYG1 
crystal structure (PDB 6eqj) using supcomb. SAXS analysis was performed 
from 16 independent scattering measurements of one biological sample.

[Figure panel text. Reaction 2: scavenger 1 = Na-EDTA; scavenger 2 = DTT. Mobs: 29565 (−I); 29784 (cross-coupled, 71%); 29946 (n = 1, 16%); 30108 (n = 2, 13%); average: 1.42.]


Extended Data Fig. 3 | Effect of Suzuki-Miyaura reagents and 
conditions on autoglucosylation activity and one-pot Suzuki-
Miyaura autoglucosylation. a, Effect of different Suzuki-Miyaura
reaction (scheme shown) components on GYG-WT activity. Bottom, plots 
illustrating proportions of glycoforms present before (black) and after 
(red) glucosylation assay, as calculated from the relative peak intensities 
of each glycoform in the corresponding LC-MS spectra, and represent 
single experiments (n = 1) carried out in parallel. Boronic acid and the 
Pd-scavenger DTT (dithiothreitol) did not appear to cause enzymatic 
inactivation in isolation. In the presence of palladium with limited Pd- 
removal/refolding steps, either far more limited activity (Pd + DTT) or 
no activity (all components) was seen. This highlighted that Pd was the 
key issue regarding GYG-WT activity. b, The effect of cross-coupling 
reaction components on GYG-WT structure, as shown by circular 
dichroism analysis. Neither DTT nor boronic acid (m-CH2OH) caused
any substantial alteration to secondary structure. Palladium catalyst, 
however, caused clear alteration of secondary structure. This effect 
could however be avoided through minimized Pd concentrations and 
thorough Pd-scavenging and removal (see c). Circular dichroism plots 
represent mean averages of three measurements of the same sample 
(n = 1). c, Demonstration of ‘enzyme-compatible’ Pd-mediated ligation.
Through minimizing the Pd concentrations employed, and post-reaction
Pd removal, cross-coupling conditions compatible with retention of 
GYG-WT activity were developed (top row). The plot in the bottom row 
illustrates proportions of glycoforms present before (black) and after (red) 
glucosylation assay, as calculated from relative peak intensities of each 
glycoform in corresponding LC-MS spectra (boxed), and represents a 
single experiment (n = 1; note however that the ‘1-m-CH2OH’ experiment
in Extended Data Fig. 4 is near-identical, differing only in glucosylation 
time). Circular dichroism (rightmost plot, bottom row) additionally 
confirmed structural similarity of GYG-WT enzyme before and after 
subjection to optimized cross-coupling conditions. See also Extended 
Data Fig. 2 for more details of structural analyses by X-ray crystallography. 
Circular dichroism plots represent mean averages of three measurements 
of the same sample (n= 1). d, One-pot SMC of GYG-Y195X and 
autoglucosylation of GYG-Glc (top row). Autoglucosylation is greater 
when palladium scavenging is carried out before glucosylation assay 
quenching (reaction 1) than vice versa (reaction 2). LC-MS data represent 
single experiments run in parallel (n= 1). e, Average number of glucoses 
added per enzyme upon treatment of GYG-WT in the presence of varying 
final concentrations of Pd (0, 0.1, 0.2, 0.4 mM). Data are mean average of
three independent replicates (n = 3); error bars are ±1 s.d.
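
As an illustrative aside (not part of the original analysis): the glycoform proportions described in this legend are relative LC-MS peak intensities, and an average sugar count is their intensity-weighted mean. The minimal Python sketch below assumes that the cross-coupled GYG-Glc species counts as one glucose and that each further glucosyl unit adds a nominal 162 Da. Under those assumptions, the percentages listed in the reaction 2 panel text (71%, 16%, 13%) give the stated average of 1.42, and the listed masses are spaced as expected. All function names are hypothetical.

```python
# Nominal mass added per glucosyl (hexose) residue, in Da (assumption: integer mass).
HEXOSE_RESIDUE_DA = 162


def glycoform_fractions(intensities):
    """Normalise relative peak intensities so that they sum to 1."""
    total = sum(intensities.values())
    return {n_sugars: value / total for n_sugars, value in intensities.items()}


def mean_glucose_count(intensities):
    """Intensity-weighted mean number of glucose units per enzyme."""
    fractions = glycoform_fractions(intensities)
    return sum(n_sugars * fraction for n_sugars, fraction in fractions.items())


def expected_mass(base_mass, extra_glucoses):
    """Nominal mass after adding further glucosyl units to a base species."""
    return base_mass + extra_glucoses * HEXOSE_RESIDUE_DA


# Keys: total glucose units per enzyme (assumed: cross-coupled species = 1).
# Values: relative peak intensities (%) as listed in the reaction 2 panel text.
peaks = {1: 71.0, 2: 16.0, 3: 13.0}
average = mean_glucose_count(peaks)
print(round(average, 2))

# Listed masses: 29784 (cross-coupled), 29946 (n = 1), 30108 (n = 2).
print(expected_mass(29784, 1), expected_mass(29784, 2))
```

The weighted mean is used here because each glycoform contributes in proportion to its share of the total signal; this treats peak intensity as proportional to abundance, which is itself an approximation.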
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Extended Data Fig. 4 | Cross-coupling to ‘simple mimics’ and 
assessment of catalytic activity of products. Top row, ‘simple’ substrate 
templates were introduced using aryl boronic acids 1-o-CH2OH, 1-m-CH2OH,
1-p-CH2OH (exploring different angles of nucleophile display)
and 1-m-OH, 1-p-OH (exploring reduced nucleophile length with similar 
angles) (shown in second row). These proceeded with useful to high 
conversions (Supplementary Methods, Section 6) to allow the direct 
creation of systematically altered GYG conjugates bearing substrate 
mimics: GYG-o-CH2OH, GYG-m-CH2OH, GYG-p-CH2OH, GYG-m-OH,
GYG-p-OH. LC-MS analysis showed that none of the cross-coupled
products showed autoglucosylation activity (upper scheme, detail
in Supplementary Methods, Section 6). Irrespective of the systematically 
varied nature of the glycan-mimic substrate templates (orientation,
length or pKa), none led to efficient mimicry of substrate moiety and
hence activation of autoglucosylation. Notably, also, the truncated, 
‘linker-only’ variant GYG-allyl-OH was inactive, highlighting further 
the critical need of an effective mimic moiety for such ‘shunting’. Thus,
despite successful Pd-mediated ligation, these ‘simple’ templates provided 
ineffective mimicry. Identically treated GYG-WT, run in parallel as a 
positive control in each case, showed detectable activity (central scheme). 
Bar charts (bottom two rows) are graphical representations of the LC-MS 
data for treated GYG-WT, showing abundances of each GYG glycoform 
before (black) and after (red) autoglucosylation assay. Bar charts are 
representations of single LC-MS experiments (n = 1, Supplementary 
Tables 13, 14).

Extended Data Fig. 5 | Proposed species observed during cross-
couplings to GYG-Y195X and possible mechanisms responsible for 
their formation. a, Left to right, unreacted GYG-Y195X (in certain cases),
cross-coupled product, de-iodination product, and a species observed
uniquely in carbohydrate couplings and proposed to result from ‘reductive
substitution’ of the carbohydrate moiety (with ‘hydride’ probably from

a hydrido-palladium species). b, Formation of cross-coupling side- 
products is illustrated using coupling to boronic acid 1-Glc as an example. 
The relationship of these possible mechanisms to that of productive 
Suzuki-Miyaura cross-coupling is highlighted. The generally accepted 
mechanism for de-iodination (red) involves coordination of a β-hydride-
containing ligand to Pd following the initial oxidative addition step of
cross-coupling; subsequent β-hydride elimination affords a hydrido-
palladium species, reductive elimination from which affords de-iodinated
side-product. ‘Reductive substitution’, seen here for carbohydrate systems
only (blue), could be rationalized by the well documented ability of Pd
to cleave Callyl–O bonds, including those of allylic glycosides such as
GYG-Glc, to form a π-allyl species. Quenching of the latter with the same
hydrido-palladium complex as invoked in de-iodination—such use of a 
‘hydride scavenger’ in Pd-catalysed de-allylation is a well documented 
process**38__would result in a ‘reductive substitution’ product, that is, 
replacement of the carbohydrate moiety (here, glucose) with hydride. Blue 
sphere, Glc; ‘Ar–I’, pIPhe195 of GYG.

Extended Data Fig. 6 | Distributions of glycoforms in assays of GYG
variants and the proposed triphasic mechanism. a, Species with up to 
12 glucose sugar residues attached are observed during autoglucosylation 
of GYG-WT-0Glc after 900 s reaction time (the key shows reaction time). 
Glc-0 is slow to decline, while accumulation of later glycoforms (for 
example, Glc-7, Glc-8) is observed. This is consistent with the slow-
fast-slow profile observed for the cross-coupled system, highlighting
the relevance of the latter. b, Species with up to 13 sugars attached
are observed during autoglucosylation of GYG-Glc (left) and
GYG-Glc-Glc (right). c, No species with more than 8 sugars is seen during
autogalactosylation of the same enzymes even after 120 min reaction time. 
Data represent mean averages from n independent replicate kinetic assays 
(n = 4 for GYG-Glc glucosylation, n = 5 for GYG-Glc-Glc glucosylation,
n = 3 for all others). Error bars are ±1 s.d. d, Proposed triphasic
mechanism inferred from kinetic experiments. GYG-Glc-Glc exhibits only
two distinct phases (fast → slow); the slower initiation step for GYG-Glc
represents an additional first (slow) phase before reaching the disaccharide
of GYG-Glc-Glc.

Extended Data Fig. 7 | Kinetic analyses and comparison of
GYG-WT-0Glc with ‘shunted’ GYG-glycomimetics. a, Left, reaction scheme
(top) with R variants under (boxed). Right, overlaid kinetic profiles
(left plot) and initial rates (right plot) for GYG-WT-0Glc, GYG-Glc,
GYG-Glc-Glc and GYG-Glcg. Data are shown as mean ± s.d. from n
independent replicate kinetic assays (n = 4 for GYG-Glc, n = 5 for
GYG-Glc-Glc, n = 3 for GYG-WT-0Glc, n = 2 for GYG-Glcg). b, Apparent rate
constants for autoglucosylation of GYG-WT-0Glc, GYG-Glc, GYG-Glc-Glc
and GYG-Glcg allowed us to reconstruct, using autoglucosylation
kinetic parameters of each of the ‘shunted’ glycomimetics (right), an
autoglucosylation profile in good agreement with the GYG-WT-0Glc
kinetic data (left: top overlay, profile; bottom overlay, nonlinear fit). Data
are shown as mean ± s.d. for n independent replicate kinetic assays (n = 4
for GYG-Glc, n = 5 for GYG-Glc-Glc, n = 3 for GYG-WT-0Glc, n = 2 for
GYG-Glcg). c, To ensure that there were no potential artefactual catalytic
effects from a de-iodinated side-product giving rise to this previously
unobserved rapid phase via intermolecular glucosylation, we also explored
its effect when added in pure form to reactions (scheme shown left); rather
than any enhancement it gave rise only to slight suppression, thereby
discounting this possibility. Observed rates (k-values) are essentially
independent of levels of GYG-Y195F, which lacks native acceptor capacity,
highlighting that our conclusions are not influenced by such cross-
coupling side products. Data are shown as mean ± s.d. for n independent
replicate kinetic assays (n = 3).
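
As an illustrative aside, one way to picture how stepwise rate constants combine into an overall autoglucosylation profile is a chain of irreversible first-order steps, G0 → G1 → G2 and so on, integrated numerically. The Python sketch below is only that: it is not the fitting procedure used for this figure, the rate constants are hypothetical values chosen to give a slow-fast-slow pattern, and the function names are invented for the example.

```python
def simulate_sequential(rate_constants, t_end, dt=0.1):
    """Forward-Euler integration of G0 -> G1 -> ... -> Gn.

    rate_constants[i] is the first-order rate constant (per second) for
    converting the species with i added sugars into the species with i + 1.
    Returns the fractional population of every species at time t_end.
    """
    populations = [1.0] + [0.0] * len(rate_constants)  # all enzyme starts as G0
    for _ in range(int(round(t_end / dt))):
        # Amount converted in this time step, computed from the current state.
        fluxes = [k * p * dt for k, p in zip(rate_constants, populations)]
        for i, flux in enumerate(fluxes):
            populations[i] -= flux
            populations[i + 1] += flux
    return populations


def mean_sugars(populations):
    """Population-weighted mean number of added sugars."""
    return sum(i * p for i, p in enumerate(populations))


# Hypothetical rate constants (s^-1): one slow step, three fast steps, two slow steps.
HYPOTHETICAL_K = [0.002, 0.02, 0.02, 0.02, 0.002, 0.002]

early = simulate_sequential(HYPOTHETICAL_K, t_end=100.0)
late = simulate_sequential(HYPOTHETICAL_K, t_end=900.0)
print(round(mean_sugars(early), 3), round(mean_sugars(late), 3))
```

In such a chain, species that sit just before a slow step accumulate, which is the qualitative behaviour a slow-fast-slow profile would be expected to produce.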
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Extended Data Fig. 8 | Glycogenin dynamics, glycoform mimics and 
active site structure considering acceptors of different length. a, Root 
mean square fluctuation (RMSF) of the enzymatic Cα atoms (top). Results
are obtained from the MD simulations of the intra ‘UDP-Glc + GYG-WT- 
Glc3’ Michaelis complex. Coloured segments in plot correspond to the 
regions coloured in the structure below (see Supplementary Information 
for details). b, Top, modelled complex of the GYG glycoform mimic
(right column), in comparison with the WT complex (left column). Blue
balls represent Glc units, and the red and thin rectangle represents the
allyl moiety. Bottom, normalized distributions of the donor-acceptor
C1–O4 distances. Both inter and intra conformations gave similar results.
c, Structural superposition of intra and inter conformations for ‘UDP-Glc
+ GYG-WT-Glcn’ complexes, with n = 0 to n = 5 Glc units (six left-hand
panels). The orange loop corresponds to the acceptor arm of the
same subunit of the active site that is displayed (that is, intra), and the 
white loop is the acceptor arm of the opposite subunit (inter). The protein 
loop coloured in blue hinders the approach of short acceptors in intra 
conformations. Specifically, the loop clashes with Tyr195 for the n = 0
(intra), causing it to move away from the donor, as indicated by the black
arrow. The n = 1 (intra) is also affected by the loop, as reflected in the shift
of the corresponding C1–O4 frequency peak. Frequency distributions
are shown in the respective six right-hand panels. Each distribution was
obtained from 0.4 μs of simulation data. The maximum frequency peaks
for intra/inter conformations correspond to 6.4/3.8 Å (n = 0), 3.6/3.2 Å
(n = 1), 3.3/3.3 Å (n = 2) and 3.2/3.2 Å (n = 3). Hydrogen atoms have been
omitted for clarity. 
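
As an illustrative aside, a normalized distance distribution and its maximum frequency peak can be obtained from per-frame distances by simple binning. The Python sketch below assumes a 0.1 Å bin width and uses a short list of made-up distances purely to exercise the code; it does not reproduce the simulation data, and the function names are hypothetical.

```python
from collections import Counter


def normalized_distance_histogram(distances, bin_width=0.1):
    """Map each bin centre (Å) to the fraction of frames falling in that bin."""
    # Each distance is assigned to the nearest multiple of bin_width.
    counts = Counter(round(d / bin_width) for d in distances)
    total = len(distances)
    return {
        round(index * bin_width, 3): count / total
        for index, count in sorted(counts.items())
    }


def peak_distance(histogram):
    """Bin centre with the highest frequency."""
    return max(histogram, key=histogram.get)


# Made-up C1–O4 distances (Å), one per simulation frame, for illustration only.
frames = [3.2, 3.3, 3.3, 3.4, 3.3, 3.1, 3.3, 3.5]
hist = normalized_distance_histogram(frames)
print(peak_distance(hist))
```

Normalizing by the number of frames makes distributions from trajectories of different lengths directly comparable.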
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Extended Data Fig. 9 | Glycogenin plasticity and simulations of the 
glucosylation reaction mechanism. a, Modelled intra ‘UDP-Glc + (Glc)2-
Tyr195’ complexes (panels on left) and normalized distribution of the
C1–O4 distances considering donor and acceptor Gal variants (panels on
right). The change of Glc (blue) to Gal (yellow) in the acceptor displaces
the reactive hydroxyl by 0.5 Å (from 3.3 Å to 3.8 Å). The Gal modification
at the donor site displays alternative conformations (not shown) in which 
the OH-2 and OH-3 substituents interact with D163. Hydrogen atoms 
have been omitted for clarity. b, Computed free-energy landscape for the 
intra ‘UDP-Glc + GYG-WT-Glc3’ reaction catalysed by GYG (contour
lines at 1 kcal mol⁻¹; left panel) and atomic rearrangement along the
reaction pathway (six panels on the right). Hydrogen atoms have been 
omitted for clarity, except OH-2, OH-3 and OH-4 of the acceptor sugar, 
the OH-2 of the donor sugar and those of the side-chain amide NH of 
Q164. Bonds being broken/formed are represented as dashed red lines 
(snapshots 1 and 3). c, Hydrogen-bond interactions (dashed lines; from 
PDBs 3T70, 3U2V and 3U2U) that were restrained during the first steps 
of the initial classical MD simulations. Q164, interacting with the acceptor 
OH-3, is not shown for clarity.


Reassessing evidence of life in 3,700-million-year-old rocks of Greenland


Abigail C. Allwood, Minik T. Rosing, David T. Flannery, Joel A. Hurowitz & Christopher M. Heirwegh


The Palaeoarchean supracrustal belts in Greenland contain Earth’s 
oldest rocks and are a prime target in the search for the earliest 
evidence of life on Earth. However, metamorphism has largely 
obliterated original rock textures and compositions, posing a 
challenge to the preservation of biological signatures. A recent 
study of 3,700-million-year-old rocks of the Isua supracrustal belt 
in Greenland described a rare zone in which low deformation and 
a closed metamorphic system allowed preservation of primary 
sedimentary features, including putative conical and domical 
stromatolites¹ (laminated accretionary structures formed by
microbially mediated sedimentation). The morphology, layering, 
mineralogy, chemistry and geological context of the structures were 
attributed to the formation of microbial mats in a shallow marine 
environment by 3,700 million years ago, at the start of Earth’s rock 
record. Here we report new research that shows a non-biological, 
post-depositional origin for the structures. Three-dimensional 
analysis of the morphology and orientation of the structures 
within the context of host rock fabrics, combined with texture- 
specific analyses of major and trace element chemistry, show that 
the ‘stromatolites’ are more plausibly interpreted as part of an 
assemblage of deformation structures formed in carbonate-altered 
metasediments long after burial. The investigation of the structures 
of the Isua supracrustal belt serves as a cautionary tale in the search 
for signs of past life on Mars, highlighting the importance of three- 
dimensional, integrated analysis of morphology, rock fabrics and 
geochemistry at appropriate scales. 

Earth’s earliest fossil assemblages are important for understanding 
the origins of life on Earth and, by analogy, how and where to search 
for signs of primitive life in the rock record of other planets”. The oldest 
widely accepted evidence of life on Earth is in marine metasedimen- 
tary rocks of the Pilbara Craton, Australia, in the form of a microbial 
stromatolite reef** and fossil biofilms? of the 3,450-million-year-old 
(Myr) Strelley Pool Formation. Putative microfossils® (Strelley Pool 
Formation) and stromatolites (3,490-Myr-old Dresser Formation)’ 
also occur in the Pilbara Craton, but their biogenicity is equivocal®. In 
Greenland, geochemical features compatible with microbial activity 
exist”, but their interpretation has been questioned!!"’. The presence 
of 3,700-Myr-old stromatolites in Greenland’s Isua supracrustal belt 
(ISB), if true, would represent an entirely new and compelling type of 
biosignature in Earth's oldest rocks and establish the start of the fossil 
record 200 Myr earlier than previously thought’. 

The putative stromatolites, discovered approximately 150 km 
northeast of Nuuk (Extended Data Fig. 1), were described¹ as elongate
cones and domes 1-4 cm high, with apices pointing upward relative 
to overturned sedimentary bedding. Combining those attributes with: 
(1) internal stromatolitic lamination that is continuous across the crests 
of the structures; (2) diverse morphologies similar to younger strom- 
atolites; (3) associated shallow-water sedimentary features, including 
sedimentary onlap; (4) differences in chemical composition inside the 
structures compared to surrounding sedimentary rock; (5) the presence 
of low temperature dolomite; and (6) seawater-like rare earth element
and yttrium (REE + Y) composition of the dolomite, it was proposed
in the previous study¹ that the structures are stromatolites produced by
microbial communities in a shallow marine, carbonate-platform envi- 
ronment similar to the stromatolites in the Strelley Pool Formation*”. 
Preservation of these features was attributed to an approximately 
30-m x 70-m low-deformation lacuna in the hinge of an anticline’. 

We located the discovery outcrops using data from the previous 
study¹. Site A consists of brownish-grey layered dolomitic rock, with
light-grey triangular features mostly oriented apex-up relative to
the overturned layering. However, some are apex-down (Fig. 1a and
Extended Data Fig. 2c), which is inconsistent with upward growth of 
the structures from a palaeo-seafloor. Dolomitic breccia nearby (site 
C) was previously interpreted as a tempestite (storm deposit), which 
in turn was taken as evidence of a shallow-water, ice-free sedimentary 
environment¹. However, a wider view of the outcrop shows ductile and
brittle deformation of the clasts, including extreme elongation when 
viewed from the side (Extended Data Fig. 3), indicating that the breccia 
has a tectonic origin and has no bearing on water depth, ice or other 
sedimentary conditions. 

A sample was acquired approximately 0.5 m from the original 
‘stromatolite’ sample site of site Al“, including one of the triangular 
structures (Extended Data Fig. 2). Cut parallel to the weathered face 
(face 1, Fig. 1c-e), the sample shows irregularly layered light- and 
medium-grey quartz–dolomite layers with dark micaceous layers and
foliation. An array of millimetre- to centimetre-scale convex-up fea- 
tures, of which the triangle structure is the largest, all have subparallel 
axial planes (Fig. 1c, d). Notably, the base of the triangular structure is 
also convex-up and conformable with small convex-up features in the 
underlying quartzose layers. The fabric is extensively disrupted by pla- 
nar discontinuities, or spaced cleavage, subparallel to the axial planes of 
the convex-up features (Fig. 1c). By contrast, when viewed orthogonally 
(face 2, Fig. 1b), the rock shows flat, even layering without any strom- 
atolites, bumps or irregularities (Fig. 1b, e and Extended Data Fig. 4). 

Such orientation-dependent, contrasting fabrics are inconsistent with 
sedimentary processes. Rather, they are typical deformation fabrics 
found in a multi-layered rock that has been shortened in one direction 
(parallel to layering), producing minor folds, cleavage and other com- 
pressional features similar to those observed on face 1; and substantially 
lengthened in an orthogonal direction, producing extensional rod-like 
features such as those observed on face 2. The type of deformation 
indicated is consistent with the structural setting of the rock, within 
the hinge of an anticline’. 

Accordingly, cuts parallel to face 1 show that the ‘stromatolites’ are 
not cones or elongate cones, but ridges extending at least 10 cm (our 
sampling depth) into the rock, aligned with the lengthening direction. 
The ridges probably extend further, given the extreme elongation of 
the rock fabric observed in the outcrop. Photographs published in the 
previous study¹ suggest that the structure that they sampled is also
ridge-shaped. Although ridge morphology alone does not preclude bio- 
logical origins, it is easier to produce ridges abiotically than cones**. 
More importantly, a deformational origin is more plausible given the 
alignment of the ridges with the lengthening direction indicated by 
the rock fabrics. 

The previous study¹ included outcrop photographs of thin, recessive 
laminae that tangentially truncate against a structure, which the authors 
use as evidence of seafloor growth of a stromatolite. However, similar 
truncation occurs in our sample where micaceous foliation terminates 
against the triangular structure, an observation that is consistent 
with the presence of a rigid object (the quartzose ridge) in a ductilely 
deforming rock, leading to deflection and pressure solution of the mica 
and carbonate foliation on the shortening side. Tangentially truncated 
laminae observed in the thin section¹ occurred at site B. However, the 
putative stromatolitic structures from site B illustrated in the previous 
study¹ are very different from those at site A: the published image only 
shows an undulose lithologic contact (Fig. 2b of the previous study¹). 

Fig. 1 | Putative stromatolites of Greenland. a, Seven structures in 
outcrop (arrows, white dashed line). b-d, Sample from site A. b, Face 2 
shows even, parallel layering. c, d, Face 1 shows irregularly layered fabric 
with planar discontinuities (arrows in c) and convex-up features (two 
yellow arrows in d). Yellow dashed boxes indicate panels expanded in g, 
h as indicated. e, Oblique view of the sample. f, Sample from the previous 
study¹, equivalent to face 1. Lines indicate the path of X-ray fluorescence 
scans in the previous study¹. The ‘d’ denotes their scan through the 
stromatolite. g, PIXL element maps of stromatolite and matrix (yellow 
box in c). Dashed line marks the edge of the structure, below which the 
composition shows a gradient from a Ca-Mn-Fe-rich rim to a Si-rich 
interior. h, PIXL maps of the light-grey layer (yellow box in b) show 
elemental composition, including Ti and K depletion, identical to the 
‘stromatolite’. 

PIXL (planetary instrument for X-ray lithochemistry) micro-X-ray 
fluorescence maps of elemental composition cast new light on putative 
evidence for biological activity¹. First, maps of the distribution of the 
elements calcium, iron and manganese show that ‘stromatolitic lamination’ 
internal to the structures is actually a dolomitic alteration rim on a 
quartzose interior (Fig. 1g) and that there is no other compositional relict 
of internal lamination in the structures. Second, titanium and potassium 
are depleted not only in the ‘stromatolites’ but also throughout the 

Fig. 2 | Relative abundance of REE + Y for different components of the 
Isua rock samples. Blue and red lines, our sample, which was separated 
into silicate and carbonate fractions. Grey lines, sample analysed in the 
previous study¹, combining carbonate and silicate. All have light rare earth 
element (LREE) depletion (Pr/Yb of <1), Y/Ho ratio of >30, 


quartzose layers (Fig. 1h), owing to the fact that they have considera- 
bly less potassium- and titanium-bearing mica compared to the dark 
layers. Finally, the iron and silicon maps show iron-rich/silicon-poor 
dark layers and silicon-rich/iron-poor light layers, suggesting that the 
rock may have originally consisted of intercalated cherty and iron-rich 
strata, which is consistent with previous studies and our observations 
of carbonate-altered banded iron formation and cherty metasediments 
in nearby outcrops"». 

In the previous study, low-temperature dolomite formation was 
inferred from C and O isotopes, and this was interpreted as evidence 
of biogenic dolomite formation in the sedimentary environment¹. 
However, the role of microbes in low-temperature dolomite formation 
is equivocal!®. Furthermore, low temperature does not preclude sec- 
ondary origins—an hypothesis supported by the presence of dolomite 
alteration rims observed in PIXL maps (Fig. 1g). 

The REE + Y geochemistry of dolomite was mentioned as evidence 
of primary marine carbonate sedimentation in the previous study¹. 
However, synchrotron X-ray fluorescence element maps show that 
the 800-µm-wide laser-ablation inductively coupled plasma mass 
spectrometry scans collected by Nutman et al.¹ would have sampled a 
mixture of dolomite, quartz and micas. Therefore, their REE + Y pat- 
terns cannot be attributed to dolomite alone. Micas, in particular, are 
important trace element carriers in these rocks (Extended Data Fig. 6 
and Supplementary Information). 

To resolve this uncertainty, we separated carbonate and silicate 
(quartz and mica) fractions (Extended Data Fig. 5) by acid digestion 
and measured the REE + Y by mass spectrometry (Fig. 2 and 
Extended Data Table 1). Both have REE + Y patterns broadly con- 
sistent with the properties of Archean to Paleoproterozoic seawa- 
ter'”-!°, However, the overall abundance is higher in the silicates than 
carbonates, and higher in the mica-rich silicate sample than in the 
mica-poor silicate sample. These observations can be attributed to a 
high REE + Y concentration in micas. The silicate REE + Y pattern 
also has a larger positive Eu anomaly than the carbonate, indicating 
a different origin of the silicates compared to carbonate. Given the 
observed carbonate alteration of quartz (Fig. 1g), the most plausi- 
ble interpretation is that the carbonate REE + Y composition was 
inherited from diagenetic and/or metasomatic fluids. In summary, 
the texture-specific distribution of major and minor elements, and 
the REE + Y composition of the rocks are consistent with original 
deposition in a marine environment, followed by secondary carbonate 
alteration any time between early diagenesis” and late carbonate 
metasomatism—the latter process having been well-documented in 
nearby ISB meta-sedimentary rocks!°. 

positive La and Eu anomalies, broadly consistent with Archean seawater 
origins. However, carbonate and silicate fractions are different in 
abundance (due to mica) and pattern, including a more pronounced 
positive Eu anomaly in the silicate fractions. PAAS, post-Archean 
Australian Shale composite. 
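
The Y/Ho criterion quoted in the Fig. 2 legend can be checked directly against the tabulated concentrations. The following is a minimal sketch, assuming the Y and Ho values are transcribed correctly from Extended Data Table 1; the dictionary and function names are illustrative and are not part of the original analysis.

```python
# Y and Ho concentrations (p.p.m.) as listed in Extended Data Table 1.
# Labels follow the Methods: C = 'stromatolite' subsample, M = 'sediment'
# subsample; the suffix 3C is the carbonate leach, 3M the silicate residue.
Y_HO_PPM = {
    "C3C": (1.335, 0.025),
    "M3C": (3.209, 0.053),
    "C3M": (4.182, 0.112),
    "M3M": (4.812, 0.161),
}


def y_ho_ratio(y_ppm: float, ho_ppm: float) -> float:
    """Return the Y/Ho mass ratio for one sample."""
    return y_ppm / ho_ppm


# Ratio for every fraction, keyed by sample label.
ratios = {label: y_ho_ratio(y, ho) for label, (y, ho) in Y_HO_PPM.items()}

for label, ratio in ratios.items():
    print(f"{label}: Y/Ho = {ratio:.1f}")
```

With the rounded table values, the carbonate leaches give higher Y/Ho than the silicate residues, and the mica-rich silicate sample (M3M) falls very close to the threshold of 30 quoted in the legend.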


Therefore, we propose that none of the previously published results 
support the interpretation of the ISB structures as stromatolites: they 
lack internal lamination and we found no evidence of synsedimentary 
growth. Their triangular ridge shape is not an indicator of biogenicity 
and they do not exhibit unique chemical compositions that indicate a 
localized microbial influence on the sedimentary processes¹. We agree 
that the host rock protolith formed in a marine environment; however, 
there is no evidence for shallow water depth, and there is no unam- 
biguous evidence that carbonate was part of the primary sedimentary 
assemblage. The inherent attributes of the structures, their geological 
setting in a fold hinge, the deformation fabrics observed in the host 
rock, and the shape and alignment of the structures within the over- 
all rock fabrics—all indicate non-biological origins. In our view, it is 
very reasonable to interpret the ISB structures as products of structural 
deformation and carbonate alteration of layered rocks. On the other 
hand, we believe that the current evidence does not support the inter- 
pretation of these structures as 3,700-Myr-old stromatolites. 
METHODS 


Data reporting. No statistical methods were used to predetermine sample size. 
The experiments were not randomized and the investigators were not blinded to 
allocation during experiments and outcome assessment. 

PIXL analysis at the Jet Propulsion Laboratory. PIXL (planetary instrument for 
X-ray lithochemistry) is a microfocus X-ray fluorescence instrument developed at 
the Jet Propulsion Laboratory to fly aboard NASA's Mars 2020 rover mission. The 
PIXL engineering prototype used in this study employs a Moxtek 12-Watt 60-kV 
MAGPRO Rh X-ray tube, in 180° emission geometry, mated with an XOS (6319) 
glass polycapillary focusing optic. The optic delivers a sub-millimetre (around 
100 µm diameter at 7 keV) focused X-ray beam spot at a nominal 2-cm stand-off 
distance from the target. Two Ketek Vitus H50 AXAS-D detectors (model: 
D5C2T0-H50-ML9BEV), oriented at 20° relative to the beam axis, provide a 
near-backscatter geometry for optimized X-ray detection. Analogue-to-digital 
conversion of X-ray signals and multi-channel binning of pulses is performed by 
the Ketek-built electronics. Data were acquired using an in-house-designed acqui- 
sition software built using a National Instruments LabVIEW platform. 

Measurements of the Greenland rock were performed in air using 28 kV/100 µA 
X-ray tube-operating conditions. The rock sample was cut, polished and cleaned. 
The data of Fig. 1g were acquired by rastering the X-ray beam across the sample 
surface in x and y directions in 100-µm steps, with 15-s integration at each step, to 
produce a 10 × 15-mm² image containing 15,000 data points. For Fig. 1h, a 200-µm 
step size was selected and a 10 × 20-mm² area, containing 5,000 data points, was 
imaged. 
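
The point counts quoted above follow from the map dimensions and step sizes. The sketch below reproduces that arithmetic; the function names are illustrative, and the acquisition-time estimate assumes that only the 15-s integration per step contributes (stage motion and readout overhead are ignored).

```python
def raster_points(width_mm: float, height_mm: float, step_um: float) -> int:
    """Number of raster positions for a rectangular map at a fixed step size."""
    steps_x = round(width_mm * 1000 / step_um)
    steps_y = round(height_mm * 1000 / step_um)
    return steps_x * steps_y


def integration_hours(points: int, dwell_s: float) -> float:
    """Total integration time in hours, ignoring stage-motion overhead."""
    return points * dwell_s / 3600


fig_1g_points = raster_points(10, 15, 100)   # 10 x 15 mm map, 100-um steps
fig_1h_points = raster_points(10, 20, 200)   # 10 x 20 mm map, 200-um steps
fig_1g_hours = integration_hours(fig_1g_points, 15)

print(fig_1g_points, fig_1h_points, fig_1g_hours)
```

Under these assumptions the Fig. 1g map corresponds to roughly 62.5 h of integration time.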

To accurately identify the elements present, all of the spectra in each map 
were summed together to produce a whole-map summed spectrum. Principal 
elemental peaks of the major elements of interest were identified and peak areas 
under the characteristic X-ray peaks were derived using the in-house software 
package PIQUANT”", designed to process and fit spectral data generated by 
PIXL. PIQUANT uses a rigorously applied linear least-squares spectrum peak-fitting 
approach to ensure robust identification of elements. The fitting routine of 
PIQUANT applies Gaussian functions to each of the X-ray lines that describe a 
peak. The analytic integral of the combined Gaussians produces the net peak inten- 
sity. Peak intensity is one of the fitted variables. Also variable are two parameters 
associated with the peak widths and two more to describe the channel bin-to-peak 
energy conversion. The ‘noise’ corresponding primarily to the bremsstrahlung 
background that underlies the X-ray peaks is fitted separately using a SNIP!7- 
fitting algorithm. The background contribution is subtracted as part of deriving 
the net peak intensities. This approach enables accurate distinction of element 
peaks even when the peaks overlap. In the Greenland rock maps, the Ba Lα X-ray 
peak (4.47 keV) and Ti Kα X-ray peak (4.51 keV) are an example of this. The two 
lines are separated by only 40 eV and appear almost as a single peak, broadened 
by the contribution of both. With PIQUANT, the individual lines are 
distinguished, given that the line energies are constrained to a fixed energy and the 
peak widths and channel energy calibration parameters are constrained by values 
derived from the dominant lines of neighbouring elements (for example, Ca and 
Fe). Harnessing these constraints enables separation of Ti from Ba. 
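
The principle behind this separation can be illustrated with a simplified sketch. This is not PIQUANT code: it assumes a background-subtracted, noise-free synthetic spectrum, a single fixed peak width (the value of SIGMA_KEV is hypothetical) and only two lines. With line energies and width fixed, the two amplitudes enter the model linearly and can be recovered by solving the 2 × 2 normal equations.

```python
import math

BA_L_ALPHA_KEV = 4.47   # line energies quoted in the text
TI_K_ALPHA_KEV = 4.51
SIGMA_KEV = 0.06        # assumed Gaussian width; hypothetical value


def gaussian(energy: float, centre: float, sigma: float = SIGMA_KEV) -> float:
    """Unit-height Gaussian line shape."""
    return math.exp(-0.5 * ((energy - centre) / sigma) ** 2)


def fit_two_lines(energies, counts):
    """Linear least squares for two amplitudes with fixed centres and width."""
    g_ba = [gaussian(e, BA_L_ALPHA_KEV) for e in energies]
    g_ti = [gaussian(e, TI_K_ALPHA_KEV) for e in energies]
    # Entries of the normal equations (A^T A) x = A^T y.
    s_bb = sum(b * b for b in g_ba)
    s_tt = sum(t * t for t in g_ti)
    s_bt = sum(b * t for b, t in zip(g_ba, g_ti))
    r_b = sum(b * c for b, c in zip(g_ba, counts))
    r_t = sum(t * c for t, c in zip(g_ti, counts))
    det = s_bb * s_tt - s_bt ** 2
    amp_ba = (r_b * s_tt - r_t * s_bt) / det
    amp_ti = (r_t * s_bb - r_b * s_bt) / det
    return amp_ba, amp_ti


# Synthetic test spectrum: a weak Ba line hidden under a stronger Ti line.
energies = [4.20 + 0.01 * i for i in range(61)]
counts = [300 * gaussian(e, BA_L_ALPHA_KEV) + 1000 * gaussian(e, TI_K_ALPHA_KEV)
          for e in energies]
amp_ba, amp_ti = fit_two_lines(energies, counts)
print(amp_ba, amp_ti)
```

In real spectra, counting noise and the strong overlap of the two line shapes make the recovered amplitudes correlated, which is why the constraints from neighbouring dominant lines matter.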

One challenge that persists through these spectra is that coherent scatter diffrac- 
tion peaks appear in the energy range of spectra in which Ti and Ba peaks reside. 
Their presence has the potential to be registered falsely as a characteristic X-ray 
response. The PIQUANT software does not yet possess a treatment process that 
would allow for correction of this contribution. Therefore, a very small amount of 
data from this region may represent diffraction scatter instead of an X-ray response. 

PIQUANT utilizes the databases from a previous study”, although a number 
of those databases have been, or are currently being, modified. 

REE + Y analyses at Stony Brook University. Four samples were prepared and 
analysed by inductively coupled plasma mass spectrometry (ICP-MS) for their 
rare earth element (REE) and Y concentrations (REE + Y). A slab of the sample 
shown in Fig. 1b, c was sectioned with a tile saw (Extended Data Fig. 5) to separate 
a portion of the carbonate-rich rim on the light-grey part of the rock (subsample 
‘C’) and a portion of the overlying dark-grey part of the rock (subsample ‘M’). 
These two subsamples were hand crushed in a ceramic mortar and then powdered 
in an agate shatterbox. Half a gram of each powder was weighed out and 
sonicated with 2% nitric acid for 30 min to leach carbonate from the samples. The 
supernatant was then removed, diluted and analysed by ICP-MS; these are samples 
C3C and M3C. About 40 mg (dry weight) of the remaining leached sediment 
was then dried and dissolved in a mixture of hydrofluoric and nitric acid for 12 h 
in sealed Teflon vials on a hotplate at around 120°C. They were then dried and 
dissolved in aqua regia for 12 h in sealed Teflon vials on a hotplate at around 120°C. 
Once completely dissolved, the aqua regia was dried off and the samples were 
reconstituted in nitric acid, which was then diluted and analysed by ICP-MS; these 
are samples C3M and M3M. 

Elemental concentration analyses were performed in the FIRST (Facility 
for Isotope Research and Student Training) Laboratory in the Department of 
Geosciences at Stony Brook University on an Agilent 7500cx quadrupole ICP-MS. 
Samples were diluted to match the signal of mixed calibration standards and 
unknown concentrations were calculated based on standard calibration curves, 
with standards run frequently between unknowns to monitor for drift in signal 
intensity. The U.S.G.S. Cody shale standard, SCO-1, was used to calculate the 
elemental concentrations for these samples. The REEs and Y concentrations in 
parts per million (p.p.m.) are shown in Extended Data Table 1. Concentrations 
were calculated for the carbonate leach of each sample assuming only carbonate 
was dissolved in 2% nitric acid using Ca and Mg concentrations to calculate the 
carbonate mass dissolved. 
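
One way to carry out this calculation is sketched below. It assumes that all leached Ca and Mg was held as CaCO3 and MgCO3, uses standard atomic masses, and takes purely hypothetical input values; the exact procedure used for Extended Data Table 1 may differ.

```python
# Standard atomic masses (g/mol).
M_CA, M_MG, M_C, M_O = 40.078, 24.305, 12.011, 15.999
M_CO3 = M_C + 3 * M_O


def carbonate_mass_mg(ca_mg: float, mg_mg: float) -> float:
    """Carbonate mass implied by leached Ca and Mg, assuming CaCO3 + MgCO3."""
    caco3 = ca_mg * (M_CA + M_CO3) / M_CA
    mgco3 = mg_mg * (M_MG + M_CO3) / M_MG
    return caco3 + mgco3


def concentration_ppm(element_ug: float, carbonate_mg: float) -> float:
    """Element concentration in the carbonate fraction (micrograms per gram)."""
    return element_ug / (carbonate_mg / 1000)


# Hypothetical leach: 20 mg Ca, 12 mg Mg and 0.05 ug of a trace element.
dissolved_mg = carbonate_mass_mg(20.0, 12.0)
trace_ppm = concentration_ppm(0.05, dissolved_mg)
print(round(dissolved_mg, 2), round(trace_ppm, 3))
```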

High-resolution X-ray fluorescence mapping at NSLS-II. High-resolution syn- 
chrotron X-ray fluorescence (XRF) spectra were collected on the Sub-micron 
Resolution X-ray spectroscopy (SRX) beamline at the National Synchrotron 
Light Source 2 (NSLS-II) at the Brookhaven National Laboratory. The capabil- 
ities of the SRX beamline have previously been described”*™*. In brief, SRX is 
a hard X-ray microprobe that performs scanning micro-fluorescence microscopy 
(µ-XRF) and X-ray absorption near-edge structure (µ-XANES) analysis 
using the high-brightness NSLS-II as a source of incident radiation. The SRX 
optics allow investigation of elemental distribution and chemical speciation at 
the sub-micrometre scale. For our analyses, we used an incident beam energy 
of 12 keV. The beam was focused to a spot size of 1 µm and XRF spectra were 
collected for 0.6 s at each spot. Motorized stages were used to move the sample 
under the beam with a 2-µm step size, thus generating a two-dimensional map 
of the elemental composition of this sample. Fluorescent X-rays were detected 
using an energy dispersive X-ray detector (Hitachi Vortex silicon drift detec- 
tor). We analysed two areas, called ‘map 1’ and ‘map 2’ (Extended Data Fig. 6, 
Supplementary Information), on a cut and polished slab sample from locality A. 
This slab contains one of the putative stromatolite-like features, bounded above 
and below by alternating light and dark layers. XRF spectra for map 1 were col- 
lected on a 1-2 mm thick dark-black layer bounded above and below by thicker 
grey coloured layers. XRF spectra for map 2 were collected from within the core 
of the stromatolite-like feature. The dimensions of the maps are 200 × 200 µm² 
and 125 × 125 µm², respectively. To generate element maps, the individual X-ray 
spectra from each map were first summed into single bulk spectra (that is, one 
for map 1 and another for map 2) and fitted using the PyXRF analysis package 
developed at NSLS-II?. PyXRF uses a nonlinear least-squares method to deter- 
mine global parameters such as peak width, energy calibration values, parame- 
ters related to the Compton and elastic scattering peaks, and element identities. 
Once elements were identified in the summed spectra, individual spectra were 
searched for those elements, their peak areas were fitted, and maps of elemental 
distribution and fluorescence intensity were generated. 

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 

Code availability. PIQUANT software used in the study is available from the 
corresponding authors upon reasonable request. 


Data availability 
The datasets generated during and/or analysed during the current study are avail- 
able from the corresponding authors upon reasonable request. 
Extended Data Fig. 1 | Satellite image showing the approximate outline and location of the Isua Structural Belt and the study area. The satellite 
image of the study area. The image was obtained from Google Maps. 
Extended Data Fig. 2 | Putative stromatolites of the ISB at site A. Yellow 
arrows point to triangular shapes with apices mostly pointing down 
relative to layering. Note, the stratigraphy was inverted, as the layers have 
been overturned¹. Each of the triangles is approximately 4 cm across. The 
blue box shows the approximate outline of the sample acquired for the 
present study. The yellow box shows the area shown in Fig. 1. 
Extended Data Fig. 3 | Breccia at site C. a, Close-up view of breccia, 
from the previous study¹. Ch, chert; dol, dolomite. b, Larger field of 
view showing the same breccia block as in a, showing the location of the 
elongated rod-like fabric (rodding) on the upper right side of the rock. 
c, Top view of the breccia-containing block from a; note the contrasting 
appearance of the rock fabric. 
Extended Data Fig. 4 | Photographs of details from the deformation 
fabrics of site A. The photographs show details of the observed 
deformation fabrics on the cut and polished faces of all three pieces of the 
columnar sample that we collected from site A. Each piece was cut to show 
rock fabrics on orthogonal faces. a, Largest piece, which includes one of 
the ‘stromatolites’ on the right face. Note contrasting fabrics on adjacent 
faces. b-e, Additional pieces of the rock sample, showing further examples 
of the contrasting rock fabrics. Green arrows on all images indicate the 
orientation of the spaced cleavage. 


© 2018 Springer Nature Limited. All rights reserved. 




Extended Data Fig. 5 | Petrographic context of samples that were 
digested for REE + Y analyses. a, Slab sample before cutting. b, Slab 
sample after cutting. ‘M’ denotes the subsample from which analyses 
‘M3M’ and ‘M3C’ were derived. ‘C’ denotes the subsample from which 
analyses ‘C3M’ and ‘C3C’ were derived. Scale bar, 10 mm. 
[Figure panels: a, photograph with approximate scan locations (not to scale); b, map 1 (200 × 200 µm); c, map 2 (125 × 125 µm). Elements mapped include Ba, K, Cr, Fe, Ca, Mn, Ti, Co, Ga, Ni, V and Zn.] 

Extended Data Fig. 6 | Synchrotron XRF element maps of the ISB 
sample. The distribution of trace elements relative to minerals is shown. 
a, Photograph of the sample. White squares show map locations. Scale bar, 
10 mm. b, Distribution and X-ray intensity of detected elements for map 1. 
c, Distribution and X-ray intensity of detected elements for map 2. 
b, c, X-ray intensity variations were used to colour the element maps. Blue, 
zero X-ray intensity; red, maximum X-ray intensity. X-ray intensity ranges 
(counts per second (cps)) are shown beneath each map. All maps are for 
K-shell X-rays except for Ba, which was detected using L-shell X-rays. 
Extended Data Table 1 | REE + Y concentrations (p.p.m.) from acid digestion and ICP-MS analysis 

I.D.   La     Ce     Pr     Nd     Sm     Eu     Gd     Tb     Dy     Y      Ho     Er     Tm     Yb     Lu 
Carbonate fraction from “Stromatolite” 
C3C    0.414  0.530  0.070  0.332  0.082  0.040  0.113  0.017  0.113  1.335  0.025  0.072  0.010  0.100  0.011 
Carbonate fraction from “Sediment” 
M3C    1.670  2.080  0.260  1.047  0.203  0.080  0.252  0.034  0.230  3.209  0.053  0.170  0.025  0.190  0.026 
Silicate fraction from “Stromatolite” 
C3M    0.430  0.781  0.124  0.636  0.226  0.343  0.226  0.044  0.421  4.182  0.112  0.310  0.035  0.393  0.050 
Silicate fraction from “Sediment” 
M3M    1.438  2.456  0.341  1.398  0.344  1.358  0.468  0.060  0.528  4.812  0.161  0.485  0.052  0.519  0.096 
Credibility-enhancing displays promote the 
provision of non-normative public goods 


Gordon T. Kraft-Todd, Bryan Bollinger, Kenneth Gillingham, Stefan Lamp & David G. Rand 


Promoting the adoption of public goods that are not yet widely 
accepted is particularly challenging. This is because most tools 
for increasing cooperation (such as reputation concerns¹ and 
information about social norms²) are typically effective only 
for behaviours that are commonly practiced, or at least generally 
agreed upon as being desirable. Here we examine how advocates 
can successfully promote non-normative (that is, rare or unpopular) 
public goods. We do so by applying the cultural evolutionary theory 
of credibility-enhancing displays³, which argues that beliefs are 
spread more effectively by actions than by words alone—because 
actions provide information about the actor’s true beliefs. Based on 
this logic, people who themselves engage in a given behaviour will be 
more effective advocates for that behaviour than people who merely 
extol its virtues—specifically because engaging in a behaviour 
credibly signals a belief in its value. As predicted, a field study of 
a programme that promotes residential solar panel installation in 
58 towns in the United States—comprising 1.4 million residents in 
total—found that community organizers who themselves installed 
through the programme recruited 62.8% more residents to install 
solar panels than community organizers who did not. This effect was 
replicated in three pre-registered randomized survey experiments 
(total n = 1,805). These experiments also support the theoretical 
prediction that this effect is specifically driven by subjects’ beliefs 
about what the community organizer believes about solar panels 
(that is, second-order beliefs), and demonstrate generalizability to 
four other highly non-normative behaviours. Our findings shed 
light on how to spread non-normative prosocial behaviours, offer 
an empirical demonstration of credibility-enhancing displays and 
have substantial implications for practitioners and policy-makers. 

Public goods are crucial to human welfare but pose a challenge when 
contributing is costly to the individual. Field experiments—which ver- 
ify the conclusions of countless models and laboratory experiments— 
have demonstrated the power of reputation concerns and social norms 
for promoting contributions to public goods*. However, such interven- 
tions are typically only effective when most people already contribute 
to the public good in question (a descriptive social norm exists) or at 
least believe that people should contribute to it (an injunctive social 
norm exists)”. 

Here we investigate how to promote public goods that are not 
already normative—that is, how new prosocial norms can be spread. 
We focus on ‘bottom-up’ approaches in which individuals influence 
those around them, rather than ‘top-down’ approaches based on 
institutional sanctions or policies® , and ask why some individuals 
are more successful than others in promoting the adoption of 
new prosocial norms. To shed light on this question, we leverage a 
theory from the study of cultural evolution that has primarily been used 
to explain religious commitment®*: credibility-enhancing displays 
(CREDs)?. 

The essence of this theory is that your actions help to shape my 
second-order beliefs (that is, what I believe about what you believe), 
which in turn influences my adoption of your beliefs. In particular, the 


theory of CREDs focuses on actions that are expected to be beneficial 
to people who hold the belief but expected to be costly to people who 
do not hold the belief. If I see you engage in such an action, it provides 
a credible signal that you actually hold the belief and thus think the 
action is beneficial—a much stronger signal than if you simply say that 
you believe it. The canonical example involves assessing the edibility of 
a mushroom. If the mushroom is inedible, eating it can be extremely 
costly. Thus, seeing someone eat the mushroom after they say it is edible 
gives you much greater confidence that they truly believe it is safe to eat, 
relative to someone who merely says that the mushroom is edible—and 
this, in turn, makes you more likely to believe the mushroom is edible. 

The logic of CREDs generates a clear prediction regarding bottom-up 
attempts to promote public goods: advocates who themselves engage in 
a given behaviour should be more effective at convincing others to also 
adopt that behaviour—specifically because they are perceived as believ- 
ing that the behaviour is more beneficial. To test these predictions, we 
focus on one particular public-goods problem: the installation of resi- 
dential solar panels. The use of residential solar panels helps to reduce 
carbon dioxide emissions and resultant climate change, and thereby 
benefits society at large. But the immediate financial cost of installa- 
tion, combined with the search cost of learning about solar panels and 
suitable installers, may outweigh any personal benefit to the home- 
owner who chooses to install. The installation of solar panels remains 
descriptively non-normative (only 0.4% of American households 
had solar panels in 2014, during data collection; see Supplementary 
Information section 4.4 for details)? and—as shown by a norming study 
(see Methods and Extended Data Figs. 1, 2 for details) —there is also not 
currently a strong injunctive norm that stipulates that people should 
be installing solar panels. 

We examine the role of CREDs in the ability of community organiz- 
ers to promote the installation of solar panels by using data from a series 
of ‘Solarize Connecticut’ campaigns!®!?, run in 58 towns (with a total 
population size of 1.4 million) in the state of Connecticut by the non- 
profit organization SmartPower from 2012 to 2015. These campaigns 
promoted the Solarize programme, which included a volunteer ‘solar 
ambassador’ in each town who encouraged other residents to install 
solar panels through the programme (see Supplementary Information 
section 4 and Extended Data Fig. 3 for further details). 

Ambassadors were recruited on the basis of their centrality in 
the community social network rather than their own solar installa- 
tion choices. As a result, only a minority of the ambassadors (32.7%) 
themselves installed solar panels through the Solarize programme. 
Examining the number of Solarize installations achieved in each town 
confirms the key prediction of CREDs: more people installed solar 
panels through the Solarize programme in towns in which the ambas- 
sador also installed through Solarize, compared to towns in which the 
ambassador did not install through Solarize (Fig. 1; linear regression 
including controls for the type and timing of the Solarize campaign, 
b = 17.89, 95% confidence interval (CI) = 3.36-32.41, P = 0.017). This 
result is robust to controlling for important characteristics of the towns 
and the ambassadors: the number of residential solar panel installations 
in the town before the Solarize campaign, the number of homes suitable 
for solar panel installation in the town, the gender of the ambassador, 
whether the ambassador served in an official town government role 
and whether the ambassador had already installed solar panels before 
the Solarize campaign (see Supplementary Information section 2 for 
further details). 

Fig. 1 | Ambassadors who install solar panels through the Solarize 
programme are more successful at convincing others to participate than 
ambassadors who do not. The number of people per town who installed 
solar panels using the Solarize programme is shown as a function of 
whether the solar ambassador of that town installed through the Solarize 
programme (blue, n = 18) or did not (red, n = 40). Box-and-whiskers plot 
indicates the minimum, 25th percentile, 50th percentile (median), 75th 
percentile and maximum values. 

To help to support a causal interpretation of this correlational find- 
ing, we perform an instrumental variable regression, which is a standard 
econometric technique for inferring causality from observational data 
(for details, see Supplementary Information section 3). We instrument for 
whether the ambassadors installed through the Solarize programme with 
a variable for whether the ambassador's home was suitable for solar instal- 
lation. Given that ambassadors could only install through the Solarize 
programme if their house was suitable, suitability is a useful instru- 
ment: a test of suitability demonstrates that it is not a weak instrument 
(F ratio of 25.23) and it significantly predicts whether the ambassador 
installed using the Solarize programme (b=0.58, 95% CI = 0.34-0.82, 
P<0.001). We believe that suitability is a valid instrument because it is 
highly unlikely that suitability is correlated with potential unobserved 
confounding variables—such as ambassador motivation or installer 
quality—because suitability is based on predetermined features of the 
roof structure and shading of ambassadors’ houses (for further discussion 
of validity, see Supplementary Information section 3). In our instrumented 
regression, we continue to find a significant positive effect of ambassador 
installation on the number of townsperson installations (b = 23.82, 95% 
CI = 1.77-45.88, P=0.034), supporting our causal interpretation. 
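
The logic of the instrumental variable approach can be illustrated with simulated data. The sketch below is not the analysis reported here: it uses no data from the study, treats the treatment as continuous for simplicity, omits the control variables, and all coefficients are hypothetical. With a single instrument and no covariates, the two-stage least-squares estimate reduces to cov(z, y)/cov(z, x).

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Instrumental variable estimate with a single instrument z."""
    return covariance(z, y) / covariance(z, x)


def simulate(n=20000, true_effect=2.0, seed=0):
    """Simulate an instrument, a confounded treatment and an outcome."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        suitable = 1.0 if rng.random() < 0.5 else 0.0   # instrument
        motivation = rng.gauss(0, 1)                    # unobserved confounder
        treatment = 0.6 * suitable + 0.5 * motivation + rng.gauss(0, 0.5)
        outcome = true_effect * treatment + motivation + rng.gauss(0, 0.5)
        z.append(suitable)
        x.append(treatment)
        y.append(outcome)
    return z, x, y


z, x, y = simulate()
ols_estimate = ols_slope(x, y)
iv_estimate = iv_slope(z, x, y)
print(f"OLS: {ols_estimate:.2f}  IV: {iv_estimate:.2f}  true effect: 2.00")
```

In this simulation the unobserved confounder inflates the ordinary least-squares estimate, whereas the instrumented estimate stays close to the true effect because the instrument affects the outcome only through the treatment.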

Thus, this field study supports our hypothesis based on CREDs: 
ambassadors who installed solar panels through the Solarize pro- 
gramme were more effective at convincing others to perform similar 
installations, when compared to those ambassadors who advocated 
installation without any accompanying action. 

We complement this field study with three pre-registered exper- 
iments that were run using the online labour market Amazon 
Mechanical Turk, which is substantially more demographically diverse 
than undergraduate subject pools’’. Of particular relevance, 52% of 
our subjects indicated that they were past or current homeowners, 
and all of our results replicate when restricting to these subjects 
(see Supplementary Information section 5.2). 

Fig. 2 | Ambassador installation influences subjects’ intentions to 
install through the Solarize programme. a, b, Means (with 95% CIs) (a) 
and distributions (b) of intentions to install (1-7 Likert scale), as a 
function of whether the ambassador installed solar panels through the 
Solarize programme (blue, n = 100) or did not (red, n = 100). 
c, Subjects’ second-order beliefs fully mediate the effect of ambassador 
installation on subjects’ installation intentions. All variables are 
standardized for this analysis. The correlations between ambassador 
installation and second-order beliefs, second-order beliefs and subjects’ 
intentions to install, and ambassador installation and subjects’ intentions 
to install (without (b) and with (b’) second-order beliefs as a covariate) are 
shown. 

Experiments 1 and 2 recreated the main contrast of the field study: 
subjects were presented with a description of the Solarize campaign, a 
description of a solar ambassador who did or did not choose to install 
solar panels through Solarize and an appeal from the solar ambassador 
that detailed the benefits of the programme. Subjects then indicated 
how likely they would be to install solar panels through the Solarize 
programme. 

As in the field study, experiment 1 (n = 200 individuals) finds a
significant effect of ambassador installation: subjects reported a higher
likelihood of installing through the Solarize programme if the ambassador
installed through Solarize (m = 5.06, 95% CI = 4.80–5.32) than if the
ambassador did not (m = 3.97, 95% CI = 3.66–4.28, t198 = 5.31, d = 0.75,
P < 0.001; Fig. 2a, b). Experiment 1 also provides an initial test of our
prediction that this effect is driven by subjects’ second-order beliefs
(that is, their beliefs about what the ambassador believes about solar
panels). To test this, we developed a 12-item second-order-belief scale
(α = 0.96), in which subjects indicated their beliefs about the
ambassador’s beliefs about the benefits of the Solarize campaign. As
predicted, responses to the second-order-belief scale significantly and
fully mediate the effect of ambassador installation on the subjects’
intentions to install (97% of the effect; Fig. 2c) (see Methods and
Supplementary Information section 5.1 for details).
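
The mediation logic described above can be illustrated with a minimal sketch. It assumes a single-mediator model on standardized variables, in which every path coefficient follows from three pairwise correlations; the function name and the correlation values are hypothetical and are not taken from the study, whose actual procedure is given in Supplementary Information section 5.1.

```python
def standardized_mediation(r_xm: float, r_my: float, r_xy: float) -> dict:
    """Path coefficients for a single-mediator model on standardized variables.

    x = ambassador installed (treatment)
    m = second-order beliefs (mediator)
    y = intention to install (outcome)
    """
    a = r_xm                                # x -> m
    total = r_xy                            # x -> y, mediator omitted
    denom = 1.0 - r_xm ** 2
    b = (r_my - r_xm * r_xy) / denom        # m -> y, controlling for x
    direct = (r_xy - r_xm * r_my) / denom   # x -> y, controlling for m
    indirect = a * b                        # effect carried by the mediator
    return {
        "a": a,
        "b": b,
        "total": total,
        "direct": direct,
        "indirect": indirect,
        "proportion_mediated": indirect / total,
    }


# Hypothetical correlations chosen for illustration only.
paths = standardized_mediation(r_xm=0.5, r_my=0.6, r_xy=0.3)
print(paths)
```

In this parameterization the total effect always equals the direct effect plus the indirect effect, so "full mediation" corresponds to a direct path near zero and a proportion mediated near 1.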

Experiment 2 (n = 399 individuals) used an experimental mediation
design to provide further support for the key role of second-order
beliefs in driving the effect of ambassador installation, and to rule out
two competing explanations. We implemented a 2 × 2 between-subjects
design that crossed the manipulation of whether the ambassador
installed through the Solarize programme (from experiment 1) with
a direct manipulation of second-order beliefs regarding the benefits
of residential solar panels: subjects were informed about accidentally

Fig. 3 | Second-order beliefs explain the effect of ambassador
installation on the installation intentions of subjects. a, Mean (with 95%
CI) installation intentions (1–7 Likert scale) as a function of whether the
ambassador did not install solar panels through the Solarize programme
and did not believe in the benefits of solar panels (red, left; n = 100), did
install but did not believe in the benefits (blue, left; n = 98), did not install
but believed in the benefits (red, right; n = 99), or did install and believed


overhearing the ambassador express, in confidence, either a positive
or negative view of residential solar panels. Thus, in experiment 2
subjects did not need to rely on the installation behaviour of the
ambassador to gain insight into what the ambassador truly believed about
solar panels, and the CREDs account therefore predicts that ambassador
installation should have much less effect in experiment 2 than in
experiment 1. A two-way ANOVA finds a significant main effect of
second-order beliefs such that subjects reported a higher likelihood
of installing when the ambassador expressed a belief that residential
solar panels are beneficial (m = 5.20, 95% CI = 5.01–5.39), compared
to when the ambassador expressed a belief that residential solar panels
are not beneficial (m = 2.40, 95% CI = 2.17–2.63, F1,395 = 340.79,
d = 1.83, P < 0.001). There was also a significant, but small, main effect
of ambassador installation such that subjects reported a slightly higher
likelihood of installing if the ambassador also installed (m = 4.00, 95%
CI = 3.71–4.30) than if the ambassador did not (m = 3.60, 95% CI
= 3.31–3.88, F1,395 = 6.08, d = 0.17, P = 0.014) (Fig. 3). There was no
significant interaction between ambassador installation and second-order
beliefs (F1,395 = 2.48, P = 0.116). Critically, the coefficient on
ambassador installation is 62% smaller in experiment 2 than in experiment 1,
which provides causal evidence that second-order beliefs mediate the
effect of ambassador installation (subjects were randomly assigned
simultaneously across experiments 1 and 2 to enable this comparison;
see Methods and Supplementary Information section 5.1 for details).
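
For readers unfamiliar with how the main effects and the interaction in a 2 × 2 between-subjects design are separated, the following is a minimal sketch of a two-way ANOVA. It assumes equal cell sizes, which keeps the sums of squares orthogonal; the cells in experiment 2 were slightly unequal (n = 98 to 102), so the reported F values would have come from software that handles unbalanced designs. The function name and the toy scores are hypothetical.

```python
from statistics import mean


def two_way_anova(cells: dict) -> dict:
    """Balanced two-way between-subjects ANOVA.

    `cells` maps (level_a, level_b) -> list of scores. Every cell must
    contain the same number of scores. Returns F ratios and degrees of freedom.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if any(len(v) != n for v in cells.values()):
        raise ValueError("this sketch only handles equal cell sizes")

    grand = mean(x for v in cells.values() for x in v)
    cell_mean = {k: mean(v) for k, v in cells.items()}
    a_mean = {a: mean(cell_mean[(a, b)] for b in b_levels) for a in a_levels}
    b_mean = {b: mean(cell_mean[(a, b)] for a in a_levels) for b in b_levels}

    # Sums of squares for each main effect, the interaction and the residual.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_err = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    ms_err = ss_err / df_err
    return {
        "F_a": (ss_a / df_a) / ms_err,
        "F_b": (ss_b / df_b) / ms_err,
        "F_ab": (ss_ab / df_ab) / ms_err,
        "df": (df_a, df_b, df_ab, df_err),
    }


# Toy scores, keyed by (ambassador installed, ambassador believes): 0 = no, 1 = yes.
toy_scores = {
    (0, 0): [1, 3],
    (0, 1): [5, 7],
    (1, 0): [2, 4],
    (1, 1): [6, 8],
}
toy_result = two_way_anova(toy_scores)
print(toy_result)
```

In the toy data the belief factor shifts scores by four points and the installation factor by one, with no interaction, which mirrors the qualitative pattern reported for experiment 2 without reproducing its values.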

Experiment 2 also provides evidence for CREDs over two alternative
explanations of the effect of ambassador installation that are based
directly on actions rather than second-order beliefs. First, subjects
might dislike or distrust the non-installing ambassador because their
behaviour is hypocritical and therefore ignore their recommendation
to install through the Solarize programme. Second, the ambassador’s
installation decision might directly influence subjects’ count of the
number of people who install and therefore influence their intention
to install via perceived descriptive normativity. Contrary to these
accounts, however, subjects reported a higher likelihood of installing
in the condition with an ambassador who did not install (and therefore
was hypocritical and projected a norm of non-installation) but was
overheard expressing a belief in the benefits of solar panels (m = 4.89,
95% CI = 4.59–5.19) than in the condition with an ambassador who
installed (and was therefore not a hypocrite and projected a norm
of installation) but was overheard to not truly believe in the benefits
of solar panels (m = 2.46, 95% CI = 2.14–2.79, t195 = 10.83, d = 1.54,
P < 0.001). Thus, when put in conflict, information about the beliefs
of ambassadors overrides their actions.
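
As a consistency check on the reported statistics, the t values and Cohen's d values for the two-group comparisons can be related by a standard identity. The sketch below assumes that d was computed with a pooled standard deviation, which the text does not state explicitly; the group sizes are those given in the captions of Figs. 2 and 3, and the function name is illustrative.

```python
import math


def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Cohen's d implied by an independent-samples t statistic (pooled SD)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# Experiment 1: installed (n = 100) versus did not install (n = 100).
d_exp1 = cohens_d_from_t(5.31, 100, 100)

# Experiment 2 contrast: did not install but believed (n = 99) versus
# installed but did not believe (n = 98).
d_exp2_contrast = cohens_d_from_t(10.83, 99, 98)

print(round(d_exp1, 2), round(d_exp2_contrast, 2))  # 0.75 1.54
```

Both values agree with the reported effect sizes, and the degrees of freedom (198 and 195) match the group sizes minus two.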

Finally, experiment 3 (n = 1,206 individuals) replicated the design of
experiment 1; however, solar panel installation was replaced by one of 




in the benefits (blue, right; n = 102). b, c, The distributions of installation 
intentions in each condition are shown. We see that when information 
about the ambassador’s beliefs is provided directly in experiment 2, there is 
little effect of whether the ambassador installed on installation intentions 
of the subjects. This result is in contrast to experiment 1, providing 
evidence for the mediating role of second-order beliefs. 


four other behaviours that are strongly non-normative from both
a descriptive and injunctive perspective (see Methods and Extended
Data Figs. 2, 3 for details). For each behaviour, we compare subjects’
intention to engage in the behaviour across conditions in which
the ambassador does versus does not engage in the behaviour.
A random-effects meta-analysis on the four effect sizes reveals a
significant positive effect of the ambassador engaging in the behaviour
(d = 0.33, 95% CI = 0.21–0.44, Z = 5.63, P < 0.001), and no evidence
of heterogeneity in effect size across behaviours (χ² = 2.37, P = 0.499)
(Fig. 4). Finally, aggregating over the four behaviours we find that the
second-order-belief scale (α = 0.91) significantly and fully mediates
the effect of ambassador engagement (89% of the effect) (see Methods,
Supplementary Information section 5 and Extended Data Fig. 4 for
details). Thus, experiment 3 shows that the CREDs-based effect
documented in the field study and experiments 1 and 2 can promote a range
of highly non-normative public goods other than solar panels.
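
A random-effects meta-analysis of this kind can be sketched as follows. The text does not say which estimator was used, so the sketch assumes the common DerSimonian–Laird estimator of between-study variance; the function names, the per-behaviour effect sizes and the near-even split between arms are hypothetical placeholders, and only the per-behaviour sample sizes come from the Fig. 4 caption.

```python
import math


def d_variance(d: float, n1: int, n2: int) -> float:
    """Approximate sampling variance of Cohen's d for two independent groups."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def random_effects_meta(effects, variances) -> dict:
    """DerSimonian-Laird random-effects pooling of independent effect sizes."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero

    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "z": pooled / se,
        "Q": q,
        "tau2": tau2,
    }


# Sample sizes per behaviour are from the Fig. 4 caption; the effect sizes
# below are hypothetical and do not reproduce the published estimates.
sample_sizes = [305, 303, 297, 301]
hypothetical_d = [0.30, 0.35, 0.25, 0.40]
variances = [
    d_variance(d, n // 2, n - n // 2)
    for d, n in zip(hypothetical_d, sample_sizes)
]
result = random_effects_meta(hypothetical_d, variances)
print(result)
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is zero and the pooled estimate coincides with the fixed-effect estimate, which is the situation the reported heterogeneity test suggests.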

These results are of substantial importance for theories of cultural 
evolution, in which CREDs have a major role: despite the influence 

Fig. 4 | Ambassador engagement promotes contribution to highly
non-normative public goods. Random-effects meta-analysis of the effect 
of the ambassador engaging in the behaviour they are advocating on 
subjects’ intentions to engage in that behaviour (1-7 Likert scale). This 
was performed across four highly non-normative public goods: purchasing 
carbon offsets for flights (n = 305), only buying used goods (n = 303), 
wearing a facemask in public when sick (n = 297) and replacing grass 
lawns with more sustainable ground cover (n = 301). Effect sizes are shown 
as Cohen's d; error bars indicate 95% CI. The relative sizes of the grey 
boxes indicate the weighting assigned to the studies by the meta-analysis. 
ANOVA produces equivalent results (see Supplementary Information 
section 5.3). 

that this theory has had, there has previously been little empirical
evidence that directly supports CREDs (notable exceptions include
two previous studies) or that specifically demonstrates the mechanism
of second-order beliefs. Our experiments provide such support.
Furthermore, although it has previously been theorized that CREDs
may help to explain prosocial behaviour more broadly, we apply the
logic of CREDs to the spread of non-normative public goods in
particular, which demonstrates an important role for this theory in solving
one of the major outstanding challenges in cooperation research. Finally,
we present an experimental methodology that can be used to empirically
investigate the effects of CREDs in a wide range of contexts beyond
that of public goods.

Our results also contribute to the literature on influence, persuasion
and attitudes, as well as community organizing and the diffusion of
solar panels in particular, by empirically demonstrating the importance
of ‘practicing what you preach’. Although this result might seem
obvious in retrospect, the data suggest that it was not in fact self-evident
in prospect: only 32.7% of solar ambassadors recruited as Solarize
Connecticut community organizers were people who themselves
installed residential solar panels through the programme.

Problems of cooperation and the provision of public goods are 
becoming increasingly important and urgent. The results presented 
here suggest that whether we are advocating for residential solar panels, 
public transportation, supporting local businesses or civil liberties, our 
campaigns will be more effective if they are built on a foundation not 
only of words but also of action. 


Online content 

Any methods, additional references, Nature Research reporting summaries, source 
data, statements of data availability and associated accession codes are available at 
https://doi.org/10.1038/s41586-018-0647-4. 

METHODS


No statistical methods were used to predetermine sample size. The field study was 
not randomized, and the experiments were randomized. Investigators were not 
blinded to allocation during experiments and outcome assessment. 

Field study. The Solarize Connecticut campaign was run in five rounds over 
2012-2015 in 58 towns in Connecticut (see Supplementary Information section 4 
for more detail on the rounds and round types). Solarize Connecticut was an 
initiative of the Connecticut Green Bank administered by the non-profit
organization SmartPower (similar campaigns have also been conducted in New York,
Massachusetts and Washington, and are underway in several additional states
including North Carolina, South Carolina, Pennsylvania, Montana and California).
There were four main marketing principles of the campaign: town-supported 
outreach and education, pre-selected solar installers, discount pricing through 
a tiered pricing structure and a clear termination date to the campaign. Solarize 
programmes were designed to limit the time and number of approved installers of 
residential photovoltaic systems or solar panels within a town. 

The campaigns were organized by local volunteers (‘solar ambassadors’), who
were primarily recruited from town selectpeople, town managers and members 
of the town clean-energy task forces, because they were expected to be key 
nodes in the community social network, and therefore more influential. Before 
the campaigns, solar installation rates were extremely low among the towns 
studied (m< 1%), so it is not surprising that most ambassadors did not 
have solar panels themselves before the campaign. Although many towns 
had multiple ambassadors, there was always a point person for SmartPower, 
who was the ‘primary’ ambassador. In this study, we focus on the one primary 
ambassador (identified by SmartPower, who were blind to the hypotheses tested 
here). In the field study, ambassadors in n= 18 towns installed solar through 
the Solarize programme and ambassadors in n = 40 towns did not. Because the 
ambassadors were recruited based on their centrality in the community social 
network rather than their own solar installation choices, it is not surprising that a 
majority of the ambassadors (67.3%) did not themselves participate in the Solarize 
programme. 

Data from this field experiment originate from three sources: (1) Connecticut
Green Bank recorded installations and their timing; and after the conclusion of
the Solarize campaign, solar ambassadors were (2) sent an online survey and 
(3) interviewed in person. Connecticut Green Bank data were used to ascertain 
installations as the dependent variable in the analysis of the field study, and 
ambassador interviews and surveys were used to derive the individual difference 
measures of ambassadors that were the predictors in the field study. We obtained 
county-level data on the suitability of rooftops for solar photovoltaics from the 
company GeoStellar. Yale University’s Institutional Review Boards reviewed the 
use of this data and approved it under protocol 1303011727. 

Experiments. Our experiments were conducted using Qualtrics survey software
and subjects were recruited using the crowdsourcing tool Amazon Mechanical 
Turk. Informed consent was obtained from all subjects and was approved by Yale 
University’s Institutional Review Boards protocol 1307012383. 

Experiments 1 and 2 were designed to capture the key features of the field study 
in a vignette context (see Supplementary Information section 6 for full experi- 
mental materials for each experiment). The vignettes began on screen 1 with an 
initial description that provided basic information about the Solarize Connecticut 
campaign. Then, screen 2 provided information about a hypothetical solar ambas- 
sador (for maximal similarity to the field study, subjects were only provided with 
information about the ambassador that would have been evident to a community 
member in conversation with an ambassador). At the end of screen 2, subjects in 
experiment 1 and experiment 2 were randomized to receive information that indi- 
cated that the ambassador did or did not install solar panels through the Solarize 
programme. Then, on screen 3 the ambassador gave subjects the ‘pitch’ for Solarize, 
which was copied from the Solarize Connecticut website (http://solarizect.com/ 
about-solarize/). In experiment 2, subjects next saw an additional screen in which 
they learned about the ambassador’s private beliefs as a result of being behind the
ambassador in the checkout line at the grocery store and overhearing him speaking 
on the phone to his wife (this screen was not included in experiment 1). Subjects 
in experiment 2 were randomized to either learn that the ambassador's beliefs 
regarding solar panels were positive or negative. On the next screen, subjects in 
both experiments indicated their likelihood of participating in the Solarize pro- 
gramme using a seven-point Likert scale. This question is our dependent variable. 
Subjects in experiment 1 were then shown an additional screen with the
second-order-belief scale that we developed for this study (α = 0.96). The scale consisted
of twelve items, each of which assessed different aspects of subjects’ second-order 
beliefs regarding Solarize (see Supplementary Information section 6.1.2 for the 
full scale). Finally, subjects in both experiments completed comprehension and 
demographic questions—including whether they were a homeowner—and were 
thanked for their participation. 
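
The reliability coefficient quoted for the second-order-belief scale (α) is presumably Cronbach's alpha, the usual internal-consistency measure for multi-item scales. A minimal sketch of that calculation is given below under this assumption; the function name and the response data are hypothetical, and real data would have one row per subject and twelve item scores per row.

```python
from statistics import variance


def cronbach_alpha(responses) -> float:
    """Cronbach's alpha for a list of per-subject rows of item scores."""
    k = len(responses[0])  # number of items in the scale
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical two-item responses from four subjects (1-7 scale).
toy_responses = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 2))  # 0.75
```

Alpha approaches 1 when the items rise and fall together across subjects, which is what a value such as 0.96 indicates for a 12-item scale.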

In summary, experiment 1 had two conditions (ambassador installed versus 
ambassador did not install) and experiment 2 had four conditions, in a 2 (ambas- 
sador installed versus ambassador did not install) x 2 (ambassador had positive 
beliefs about solar panels versus ambassador had negative beliefs about solar pan- 
els) design. To enable comparison of behaviour across experiment 1 and experi- 
ment 2, subjects for both experiments were recruited at the same time and were 
randomly assigned to one of the six total conditions. Our pre-registered target sam- 
ple size was 100 subjects per condition, and we recruited n= 200 (64.5% female, 
average age = 34.6 years) for experiment 1 and n = 399 (61.1% female, average 
age = 34.6 years) for experiment 2. Subjects completed the survey in 7 min on 
average and were compensated 0.50 US dollars (US$). 

Experiment 3 aimed to test whether the results of the two-condition study of 
experiment 1 generalized to four highly non-normative behaviours. Thus, sub- 
jects were randomly assigned to one of eight between-subjects conditions in a 2 
(ambassador engaged versus ambassador did not engage in the behaviour they were 
advocating) x 4 (behaviour being advocated: purchasing carbon offsets for flights, 
only buying used goods, wearing a facemask in public when sick and replacing 
grass lawns with more sustainable ground cover) design. The format of experiment 
3 was identical to experiment 1, except that all text related to installing solar panels 
through Solarize was replaced with relevant text about one of the four behaviours.
Full experimental materials are shown in Supplementary Information section 6.4. 
Our pre-registered sample size was n= 150 subjects per condition, and we recruited 
n= 1,206 subjects (63.3% female, average age = 34.8 years), who completed the 
survey in 5 min on average and were compensated US$0.50. 

Subjects in the norming study were asked for their normativity judgments on 
various behaviours associated with contributions to public goods (presented in 
randomized order): wearing a face mask in public when sick with the flu or a 
cough; replacing grass lawns with more sustainable ground cover; buying carbon 
offsets for flights; buying only used consumer goods; installing residential solar 
panels; donating to charity; recycling; and voting in elections. For each behav- 
iour (presented in randomized order), subjects were asked for their judgments of 
injunctive normativity, with the question: ‘in your opinion, how much do people in 
your community think this behaviour is what you are supposed to do?’ Responses
were given on a seven-point scale from 1 (very little) to 7 (very much). The same 
was performed for descriptive normativity, with the question: ‘in your opinion, how 
many people in your community do this behaviour?’ (responses from 1 (very few) 
to 7 (very many)). Our pre-registered sample size was n = 100, and we recruited 
n= 100 subjects (45% female, average age = 36.2 years), who completed the study 
in 4 min on average and were compensated US$0.50. 

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 


Data availability 
All data are publicly available at: http://osf.io/wbmjc. 

Extended Data Fig. 1 | Results of the norming study regarding
injunctive norms. Distributions are shown of subjects’ (n = 100)
responses to the question: ‘in your opinion, how much do people in your
community think doing this behaviour is what you are supposed to do?’
Responses were given on a Likert scale between 1 (‘very little’) and
7 (‘very much’).

Extended Data Fig. 2 | Results of the norming study regarding
descriptive norms. Distributions are shown of subjects’ (n = 100)
responses to the question: ‘in your opinion, how many people in your
community do this behaviour?’ Responses were given on a Likert scale
between 1 (‘very few’) and 7 (‘very many’).

Extended Data Fig. 3 | Example photographs from Solarize campaigns. a–c, A live installation event (a), a campaign kick-off event (b) and flyers and
signs for an informational event (c) are shown. Photographs courtesy of SmartPower. 

Extended Data Fig. 4 | Mediation analysis for experiment 3. Because
there is no evidence of heterogeneity in the effect of CREDs across
non-normative public-good scenarios, we collapse across scenario (total
n = 1,206) and see that subjects’ second-order beliefs fully mediate the
effect of ambassador engagement on subject intentions to engage in the
behaviour in question. All variables are standardized for this analysis.
The correlations between ambassador engagement and second-order
beliefs, second-order beliefs and subjects’ intentions to engage in the
behaviour, and ambassador engagement and subjects’ intentions to engage
in the behaviour (without (b) and with (b’) second-order beliefs as a
covariate) are shown.

m6A facilitates hippocampus-dependent learning
and memory through YTHDF1


Hailing Shi, Xuliang Zhang, Yi-Lan Weng, Zongyang Lu, Yajing Liu, Zhike Lu

N6-methyladenosine (m6A), the most prevalent internal RNA
modification on mammalian messenger RNAs, regulates the fates
and functions of modified transcripts through m6A-specific binding
proteins. In the nervous system, m6A is abundant and modulates
various neural functions. Whereas m6A marks groups of mRNAs
for coordinated degradation in various physiological processes,
the relevance of m6A for mRNA translation in vivo remains largely
unknown. Here we show that, through its binding protein YTHDF1,
m6A promotes protein translation of target transcripts in response
to neuronal stimuli in the adult mouse hippocampus, thereby
facilitating learning and memory. Mice with genetic deletion of
Ythdf1 show learning and memory defects as well as impaired
hippocampal synaptic transmission and long-term potentiation.
Re-expression of YTHDF1 in the hippocampus of adult Ythdf1-knockout
mice rescues the behavioural and synaptic defects, whereas
hippocampus-specific acute knockdown of Ythdf1 or Mettl3, which
encodes the catalytic component of the m6A methyltransferase
complex, recapitulates the hippocampal deficiency. Transcriptome-wide
mapping of YTHDF1-binding sites and m6A sites on
hippocampal mRNAs identified key neuronal genes. Nascent protein
labelling and tether reporter assays in hippocampal neurons showed
that YTHDF1 enhances protein synthesis in a neuronal-stimulus-dependent
manner. In summary, YTHDF1 facilitates translation
of m6A-methylated neuronal mRNAs in response to neuronal
stimulation, and this process contributes to learning and memory.

The discovery of reversible RNA methylation and the development
of transcriptome-wide mapping methods of m6A have sparked
extensive research into the functions of m6A methylation in diverse
biological processes, with several studies showing that m6A governs
mRNA stability during cell fate transition and animal development.
In addition, m6A affects mRNA translation. In HeLa cells,
the m6A binding protein YTHDF1 facilitates the initiation of
translation of m6A-modified mRNAs. In MCF7 cells, m6A and YTHDF1
appear to have more complex effects on translation. Therefore, it
remains to be discovered how m6A modulates mRNA translation and
to what extent this function affects physiological events in intact
biological systems.

Previous studies have suggested that m6A modulates neuronal
functions, including dopaminergic signalling in the mouse midbrain, flight
and locomotor behaviours in flies, neurogenesis in adult mice, and
axon regeneration in mice. Upregulation of m6A has been observed to
occur with brain maturation, behavioural experience, and memory
formation, suggesting a link between m6A accumulation and brain
activity. Learning and memory are fundamental functions of brains,
and long-term memory formation is believed to require activity-induced
protein synthesis. We therefore investigated whether learning
and memory processes could be affected by the translational effects of
m6A and YTHDF1.

In the mouse brain, Ythdf1 mRNA is preferentially expressed in the
hippocampus, a key region in spatial learning and memory. We
constructed Ythdf1 CRISPR-Cas9 knockout (Ythdf1-KO) mice (Extended
Data Fig. 1a–d), in which complete elimination of YTHDF1 protein was
verified in the hippocampus and other brain regions (Fig. 1a; Extended
Data Fig. 1e, f). Compared to wild-type control littermates, Ythdf1-KO
mice developed normally up to four months of age (the end of the
experiment) and appeared to have normal gross hippocampal histology
(Fig. 1b), adult hippocampal neurogenesis (Extended Data Fig. 2a, b),
and cortical morphology (Extended Data Fig. 2c, d). Moreover, loss of
YTHDF1 did not alter the motor abilities or general emotional states
of the mice (Extended Data Fig. 2e–m).

We first examined hippocampus-dependent spatial learning and
memory in these mice using Morris water maze (MWM) tests
(Extended Data Fig. 3a; see Methods). In visible platform training,
Ythdf1-KO mice performed as proficiently as wild-type mice (Fig. 1c),
indicating that they could see and acquire procedural learning
normally. However, in hidden platform training, Ythdf1-KO mice spent
a longer time navigating to the platform than control mice (Fig. 1d),
suggesting that their spatial learning was impaired. In the probe test,
Ythdf1-KO mice failed to remember the previous platform location,
spending a similar time in each quadrant despite showing normal
swimming activity (Fig. 1e; Extended Data Fig. 3b, c), suggesting
defects in spatial memory.

To further confirm the importance of YTHDF1 in hippocampus-dependent
learning and memory, we performed classical fear conditioning
tests (Extended Data Fig. 3d; see Methods). Contextual fear memory
is sensitive to hippocampal defects, whereas auditory fear memory
depends on the amygdala. We titrated the electric shock intensity to
establish fear conditioning protocols that did not saturate fear responses
(Extended Data Fig. 3e). Under a moderate training protocol (0.5 mA,
2 s, 1 pair), Ythdf1-KO mice showed a smaller freezing response during
inter-trial intervals (ITIs) but not when the tone sounded (Fig. 1f;
Extended Data Fig. 3f), suggesting that contextual but not auditory
learning was impaired. Twenty-four hours after conditioning,
Ythdf1-KO mice showed deficits in contextual but not auditory fear

Fig. 1 | Impaired spatial learning and memory in Ythdf1-KO mice.
a, b, Representative images of YTHDF1 immunostaining (a) and Hoechst
staining (b) in control and Ythdf1-KO hippocampus. DG, dentate gyrus;
P30 and P120, postnatal days 30 and 120. c, d, Learning curves of control
(red) and Ythdf1-KO (blue) mice in MWM tests with visible (c) and
hidden (d) platform. e, Quadrant time (per cent; left) and representative
swimming paths (right) of control and Ythdf1-KO mice in the MWM
probe test. The red dashed line represents the chance level (25%).


memory (Fig. 1g; Extended Data Fig. 3g). Under a weaker training 
protocol (0.5 mA, 1 s, 1 pair), contextual but not auditory fear memory 
of the mice was impaired two hours after training (Fig. 1h; Extended 
Data Fig. 3h). Together, these data support the idea that genetic dele- 
tion of Ythdf1 disrupts learning and memory formation in the mouse 
hippocampus. 

We next used electrophysiological characterization to study how 
YTHDF1 depletion affects hippocampal synaptic functions. We used 
whole-cell patch-clamp to examine the basal synaptic properties of hip- 
pocampal CA1 neurons (see Methods). In Ythdf1-KO CA1 neurons, 
spontaneous miniature excitatory postsynaptic currents (mEPSCs) 
were substantially decreased in amplitude and frequency, compared 
to control neurons (Fig. 2a, b). Analysis of paired-pulse ratios (PPRs) 
also indicated reduced presynaptic release probability in Ythdf1-KO
CA1 neurons (Extended Data Fig. 4a, b), confirming the defects in
basal synaptic transmission. Morphologically, Ythdf1-KO CA1 neurons
had reduced dendritic spine density but unaltered spine size (Extended 
Data Fig. 4c, d). 

Long-term potentiation (LTP) is an important cellular model for 
explaining learning and memory. To test whether YTHDF1 modulates 
long-term synaptic plasticity, we recorded LTP induced by high- 
frequency stimulation (HFS) in the CA1 region of hippocampal slices. 
Compared to wild-type controls, Ythdf1-KO slices did not generate 
normal levels of field excitatory postsynaptic potential (fEPSP) after 
two rounds of HFS induction (Fig. 2c, d). Initial potentiation following 
HFS was similar between control and Ythdf1-KO slices (Fig. 2c); there- 
fore, it is less likely that the decreased LTP was due only to impairments 
in basal synaptic transmission. Ythdf1-KO slices were also defective
in late-phase LTP (induced by four rounds of HFS; Fig. 2e, f), which 
requires activity-induced synaptic protein synthesis. Indeed, depletion 
of YTHDF1 noticeably reduced the abundance of key proteins involved 
in LTP in the postsynaptic density (PSD) fraction of hippocampal neu- 
rons (Fig. 2g, h; Extended Data Fig. 4e), although such decreases were 
not observed for those proteins in whole hippocampal tissue (Extended 
Data Fig. 4f). Together, these results suggest that depletion of YTHDF1 
in mice impairs basal synaptic transmission and LTP in hippocampal 
neurons, contributing to the observed defects in learning and memory. 

To confirm that the observed defects resulted from the loss of 
YTHDF1 specifically in the hippocampus, we investigated whether
re-expressing YTHDF1 in the hippocampus of adult Ythdf1-KO 
mice would be sufficient to rescue the phenotypes. We delivered an 

f, Learning curves of control (red) and Ythdf1-KO mice (blue) for
contextual fear conditioning in moderate (left) or strong (right) training
sessions. Base, baseline; ITI, inter-trial interval. g, h, Contextual fear
memory assessed 24 h (g) or 2 h (h) after the indicated fear conditioning.
P values, two-way ANOVA with two-tailed t-test (relative to ‘Target’ or
between genotypes) (e), two-way repeated measures ANOVA with post
hoc test (c, d, f), and two-tailed t-test (g, h). Numbers in bars, numbers of
mice. Data shown as mean ± s.e.m.

Fig. 2 | Deficient basal transmission and plasticity in Ythdf1-KO
hippocampal synapses. a, b, Representative traces (a) and quantification
of amplitude (b, left) and frequency (b, right) of spontaneous mEPSCs in
control and Ythdf1-KO hippocampal CA1 neurons. c, d, Summary plots (c)
and average amplitude (d) of LTP induced by 2 × HFS in the CA1 region
of control and Ythdf1-KO acute slices. e, f, Summary plots (e) and average
amplitude (f) of late-phase LTP induced by 4 × HFS. Top, sample traces
taken at time points 1 and 2 indicated above the summary plots; scale
bars, 10 ms (horizontal) and 0.2 mV (vertical) (c, e). g, h, Representative
western blots (g) and quantification (h) of LTP-related proteins in control
and Ythdf1-KO hippocampal postsynaptic density (PSD) fractions. P values,
Kolmogorov–Smirnov test for cumulative distributions followed by
comparisons with Mann–Whitney U test (b) and two-tailed t-test (d, f, h).
Numbers in bars, numbers of neurons/mice (b), slices/mice (d, f), and
mice (h). Data shown as mean ± s.e.m.

Fig. 3 | Selective YTHDF1 re-expression in the hippocampus rescues
defects in memory and synaptic plasticity. a, Schematics of AAV
constructs overexpressing YTHDF1 (AAV-YTHDF1) or control
(AAV-control). ITR, inverted terminal repeats; CMV, cytomegalovirus promoter;
WPRE, woodchuck hepatitis virus posttranscriptional regulatory element.
b, Illustration of bilateral viral injections into the mouse hippocampus.
Mouse brain reproduced with permission from the atlas of Paxinos
and Franklin. c, Representative fluorescence images of the mouse
hippocampus after AAV infection. Hoechst, blue; YTHDF1 co-expressed
with mCherry, red. d–g, Learning curves in MWM hidden-platform
training (d), quadrant time (per cent) in MWM probe tests (e), and
contextual (f) and auditory (g) fear memories assessed 24 h after fear
conditioning, of Ythdf1-KO mice injected with AAV-control (red) or
AAV-YTHDF1 (blue), compared to uninjected wild-type (WT, green).
h–j, Representative traces (h), summary plots (i), and average amplitude
(j) of LTP induced by 2 × HFS in acute slices from Ythdf1-KO mice
injected with AAV-YTHDF1 or AAV-control. Sample traces (h) were taken
at time points 1 and 2 indicated in the summary plots (i). P values,
two-way repeated measures ANOVA with post hoc two-tailed t-test
(horizontal P values, AAV-YTHDF1 relative to AAV-control; vertical
P values, comparisons between curves) (d), two-way ANOVA with post
hoc two-tailed t-test (comparison within group or with ‘Target’) (e),
one-way ANOVA with post hoc Fisher test (f, g), and two-tailed t-test (j).
Numbers in bars, numbers of mice (e–g) and slices/mice (j). Data shown
as mean ± s.e.m.


adeno-associated virus (AAV) expressing either YTHDF1 (AAV-YTHDF1) 
or a control fluorescent protein mCherry (AAV-control) 
specifically to the hippocampus by bilateral stereotactic injection 
(Fig. 3a, b), resulting in selective re-expression in injected regions 
(Fig. 3c; Extended Data Fig. 5a-c). Hippocampal re-expression of 
YTHDF1 in Ythdf1-KO mice substantially enhanced their learning 
and memory performance in MWM tests (Fig. 3d, e; Extended Data 
Fig. 5d) and restored contextual fear memory to normal levels (Fig. 3f), 
with no obvious effect on auditory fear memory, anxiety-like behaviour, 
or motor activity (Fig. 3g; Extended Data Fig. 5e-h); it also reversed the 
hippocampal LTP deficiency in Ythdf1-KO mice (Fig. 3h-j). 

To test whether acute loss of YTHDF1 in the hippocampus was sufficient 
to induce the phenotypes of Ythdf1-KO mice, we injected an AAV 
expressing a short hairpin RNA specifically targeting Ythdf1 transcripts 
(AAV-RNAi) into the hippocampus of adult wild-type mice (Extended 
Data Fig. 6a, b). In mice injected with AAV-RNAi, learning and memory 
in MWM tests were markedly impaired (Extended Data Fig. 6c-f), 
as was contextual fear memory but not emotional state or auditory fear 
memory (Extended Data Fig. 6g-i). Moreover, hippocampus-specific 
knockdown of Mettl3 (Extended Data Fig. 7a) also phenocopied the 
effects of YTHDF1 depletion, leading to defects in spatial memory and 
contextual fear memory without affecting auditory fear memory or 
locomotor activity (Extended Data Fig. 7b-f). These knockdown results 
further support the idea that the phenotypes observed in Ythdf1-KO 
mice are m6A-dependent and arise from direct depletion of YTHDF1 
rather than from potential developmental defects caused by its absence. 

We next investigated the molecular mechanisms underlying 
these effects. We mapped YTHDF1 binding sites and m6A 
sites on hippocampal mRNAs using crosslinking and 
immunoprecipitation-based sequencing methods (CLIP-seq; Supplementary 
Table 1). Biological triplicates of YTHDF1 CLIP-seq identified 3,552 
common peaks as high-confidence peaks (Extended Data Fig. 8a) on 
1,042 transcripts (defined as ‘YTHDF1-CLIP targets’; Supplementary 
Table 2), with validated pull-down efficiency of YTHDF1 (Extended 
Data Fig. 8b). About two-thirds of the high-confidence YTHDF1-CLIP 
peaks were mapped to mature mRNAs (Fig. 4a) and enriched near the 
stop codon and 3′ UTR (Fig. 4b). Functional annotation of YTHDF1-CLIP 
targets revealed substantial enrichment for synaptic transmission 
and LTP (Fig. 4c; Supplementary Table 2), consistent with the neuronal 
phenotypes observed in Ythdf1-KO mice. 

Biological triplicates of m6A-CLIP-seq, using purified hippocampal 
poly(A)+ RNA, yielded about 11,000 common peaks with the 
GGACU consensus sequence and enrichment at the coding sequence 
and 3′ UTR (Extended Data Fig. 8c-e) on 3,460 transcripts (defined 
as ‘m6A-modified transcripts’; Supplementary Tables 1, 3). Similarly, 
genes that mediate neuronal biological processes were overrepresented 
in m6A-modified transcripts (Extended Data Fig. 8f; Supplementary 
Table 3). At the transcript level, YTHDF1-CLIP targets on average 
have higher numbers of m6A-CLIP peaks and crosslinking-induced 
mutations detected in the m6A-CLIP-seq data, compared either to 
transcripts without YTHDF1-CLIP peaks (defined as ‘non-YTHDF1-CLIP 
transcripts’) or to m6A-modified transcripts (Fig. 4d); at the 
peak level, 30% of YTHDF1-CLIP peaks overlap (>1 nucleotide) with 
m6A-CLIP peaks (Extended Data Fig. 8g; in comparison, 0.65-0.72% 
of background peaks overlap with m6A-CLIP peaks; see Supplementary 
Table 3, Methods). These results indicate that YTHDF1 preferentially 
recognizes m6A sites in the adult mouse hippocampus. Key synaptic 
plasticity transcripts, including Gria1, Grin1, and Camk2a, contain 
one or more overlapped YTHDF1-CLIP peaks and m6A-CLIP peaks 
(Extended Data Fig. 8h). We then profiled mRNA and protein abundance 
in the hippocampus of Ythdf1-KO and control mice. Note that 
YTHDF1-CLIP targets and m6A-modified transcripts exhibit a slight 
decrease in mRNA abundance (Extended Data Fig. 8i; Supplementary 
Tables 1, 4) and no observable changes in global protein level (Extended 
Data Fig. 8j; Supplementary Table 5) in the hippocampus of Ythdf1-KO 
mice compared to control mice. 
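
The peak-level comparison in this paragraph (the share of one peak set that overlaps another) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the coordinates are made up, the half-open interval convention, the function name and the default threshold of one shared nucleotide are all assumptions chosen for illustration.

```python
from collections import defaultdict


def overlap_fraction(query_peaks, target_peaks, min_overlap=1):
    """Fraction of query peaks sharing >= min_overlap nt with any target peak.

    Peaks are (chrom, start, end) tuples using half-open coordinates
    [start, end); this convention is an assumption of the sketch.
    """
    # Group target intervals by chromosome so each query is only
    # compared against peaks on the same chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in target_peaks:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in query_peaks:
        for t_start, t_end in by_chrom[chrom]:
            shared = min(end, t_end) - max(start, t_start)
            if shared >= min_overlap:
                hits += 1
                break  # count each query peak at most once
    return hits / len(query_peaks) if query_peaks else 0.0


# Toy, hypothetical coordinates (not data from the study).
ythdf1_peaks = [("chr1", 100, 150), ("chr1", 400, 450),
                ("chr2", 10, 60), ("chr2", 500, 540)]
m6a_peaks = [("chr1", 140, 200), ("chr2", 60, 90), ("chr2", 490, 505)]

fraction = overlap_fraction(ythdf1_peaks, m6a_peaks)
print(f"{fraction:.0%} of query peaks overlap a target peak")
```

The same function applied to a set of shuffled background peaks would give the kind of baseline percentage against which the observed overlap is judged; for large peak sets, a sorted or interval-tree search would replace the nested loop.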

These mild changes prompted us to investigate whether YTHDF1 
functions in a neuronal-stimulus-dependent manner. To test this, 
we monitored nascent protein synthesis in cultured wild-type and 
Ythdf1-KO hippocampal neurons before and after potassium chloride 
depolarization (KCl, 50 mM). KCl stimulation induced noticeable protein 
synthesis in control neurons but much less in Ythdf1-KO neurons 
(Fig. 4e, f; Extended Data Fig. 9a-c). We found the same difference 
between AAV-mediated control and YTHDF1-knockdown neurons 
(Extended Data Fig. 9d, e). We also constructed a reporter system in 
which the N terminus of YTHDF1 (YTHDF1-N) was tethered to the 3′ 
UTR of the firefly luciferase (F-Luc) coding sequence, mimicking direct 
binding of YTHDF1 onto m6A-modified transcripts; Renilla luciferase 
was co-transfected for normalization (Fig. 4g). YTHDF1-N tethering 
did not affect F-Luc protein level before KCl stimulation (Fig. 4h, left), 
consistent with proteomics results (Extended Data Fig. 8j). However, 
increased F-Luc production was observed 2 and 4 h after KCl stimulus 
for YTHDF1-N tethering compared to the control (Fig. 4h), supporting 
the idea that YTHDF1 promotes protein synthesis upon neuronal 
stimulation. Indeed, Bsn, one of the top YTHDF1-CLIP targets, showed 
attenuated protein expression after fear conditioning in the Ythdf1-KO 
hippocampus, as did Camk2a in the Ythdf1-KO PSD fraction (Extended 
Data Fig. 10a). After fear conditioning, YTHDF1 protein increased by 
Fig. 4 | YTHDF1 facilitates translation of m6A-modified targets in 
response to neuronal stimulation. a, b, Distributions of high-confidence 
YTHDF1-CLIP peaks in different regions of the genome (a) and transcripts 
(b). c, Functional annotation of YTHDF1-CLIP targets (n = 1,032) in 
the adult mouse hippocampus. d, Box-plots of the number of m6A-CLIP 
peaks (left) and the log2 number of m6A-CLIP-seq mutations (right) on 
m6A-modified transcripts, non-YTHDF1-CLIP transcripts, and YTHDF1-CLIP 
targets. e, f, Representative images (e) and quantification (f) of 
nascent protein (Nascent-P) synthesis in cultured control and Ythdf1-KO 
hippocampal neurons before (sham) and 2 h after KCl depolarization. 
Nascent-P signals were normalized to that of control neurons under sham 
condition. g, Schematics of a tether reporter system that mimics binding 
between YTHDF1 and 3′ UTR m6A sites of target transcripts. YTHDF1-N, 
truncated N-terminal mouse YTHDF1 (amino acids 1-389); F-Luc/R-Luc, 
firefly/Renilla luciferase. h, Normalized F-Luc reporter expression 


30% in the PSD fraction, although no change occurred at the tissue level 
(Extended Data Fig. 10b, c). This suggests that YTHDF1 may undergo 
translocation to the PSD in response to stimulation, which could con- 
tribute to localized translation in synapses and thus synaptic plasticity. 

We also examined potential changes in the m6A landscape in the 
dentate gyrus in an electroconvulsive treatment (ECT) model, in 
which dentate granule cells are synchronously activated. The 
m6A RNA immunoprecipitation (RIP)-seq of dentate gyrus mRNAs 
(Extended Data Fig. 10d; see Methods) showed that although the level 
of YTHDF1-CLIP targets was not differentially regulated compared 
to other transcripts in response to ECT, the m6A-methylated copies 
of YTHDF1-CLIP targets were upregulated in abundance after ECT 
(Fig. 4i; Supplementary Tables 1, 6), suggesting that an increase in 
binding of YTHDF1 to its m6A-methylated targets may occur upon 
stimulation to both facilitate translation and help to stabilize these targets 
(Supplementary Discussion). 

In summary, we show that m6A methylation of mRNAs facilitates 
learning and memory formation in the mouse hippocampus, mainly 
by promoting translation of target transcripts upon neuronal 
stimulation, and that this effect is mediated through the m6A-binding protein 
YTHDF1. Depletion of YTHDF1 impairs basal transmission and LTP 
at hippocampal synapses. The presence of YTHDF1 could expedite 
new protein synthesis that is required for long-lasting changes in 
synaptic plasticity and thus memory formation; in the hippocampus of 
Ythdf1-KO mice, stimulus-dependent protein synthesis is attenuated, 
resulting in less efficient synaptic strengthening and a lower probability 
of reaching thresholds for memory formation (Extended Data Fig. 10e). 
in cultured hippocampal neurons tethered with YTHDF1-N or control 
(λ), before (sham) and after KCl depolarization. i, Box-plots of transcript 
abundance log2 fold change between electroconvulsive treated (ECT) 
and untreated (Mock) dentate gyrus, for m6A-modified transcripts, 
m6A-modified non-YTHDF1-CLIP transcripts, and transcripts with overlapped 
YTHDF1-CLIP and m6A-CLIP peaks, in ‘Input’ (left) and m6A-enriched 
‘RIP’ (right) RNA-seq libraries. Dashed lines, median log2 fold change 
of all reliably detected transcripts (reads per kilobase of transcript per 
million mapped reads (rpkm) >1). Box-plot elements: centre line, median; 
box limits, upper and lower quartiles; whiskers, 1-99%; error bars, 95% 
confidence interval of mean; number in parentheses, number of genes 
(d, i). P values, two-sided unpaired Kolmogorov-Smirnov test (d, i) and 
two-tailed t-test (f, h). Numbers in bars, numbers of images/mice (f) and 
biologically independent samples (h). Bar plots show mean ± s.e.m. (f, h). 


Promotion of translation by m6A could be stimulation-induced, as 
shown here for YTHDF1, and this might represent an important aspect 
of RNA methylation-dependent translational regulation. 
METHODS 


Data reporting. No statistical methods were used to predetermine sample size. The 
sample sizes in this manuscript were similar to those in previous papers. Experimenters 
were blind to the genotype and treatment for all behavioural tests. 

Animals. All mice were maintained under a 12-12-h light-dark cycle with lights 
on at 07:00, and temperature and humidity were kept at 22 ± 1 °C and 55% ± 5%, with 
ad libitum access to food and water. Male adult (8-16 weeks of age) mice were used 
for behavioural tests. Animal experiments, except for electroconvulsive treatment 
(ECT), were carried out in accordance with protocols approved by the Institutional 
Animal Care and Use Committee of the School of Life Science and Technology 
of ShanghaiTech University and with the Guidance Suggestions for the Care and 
Use of Laboratory Animals, formulated by the Ministry of Science & Technology of 
the People’s Republic of China. Animal procedures used in ECT were performed 
in accordance with protocols approved by the Institutional Animal Care and Use 
Committee of Johns Hopkins University School of Medicine and University of 
Pennsylvania School of Medicine. 

Cell lines. The N2A cell line used in in vitro transfection experiments was 
purchased from the Cell Bank of the Chinese Academy of Sciences and authenticated by the 
supplier. It is not in the list of commonly misidentified cell lines maintained by 
the International Cell Line Authentication Committee (ICLAC). Cells tested 
negative for mycoplasma contamination before use. 

Generation of Ythdf1-KO mice. The YTH domain family protein-1 knockout 
mice (Ythdf1-KO) were generated using CRISPR-Cas9. sgRNA expression 
plasmids were generated by annealing and cloning oligos that were designed to 
target exon 4 of Ythdf1 into the BsaI sites of pUC57-sgRNA (Addgene 51132). 
mYTHDF1-E4-1 T7 gRNA up: TAGGATAGTAACTGGACAGGTA; 
mYTHDF1-E4-1 gRNA down: AAACTACCTGTCCAGTTACTAT; mYTHDF1-E4-2 
T7 gRNA up: TAGGCACCATGGTCCACTGCAG; mYTHDF1-E4-2 gRNA down: 
AAACCTGCAGTGGACCATGGTG. 
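
As a quick consistency check on the oligo pairs listed above, the following minimal sketch confirms that each ‘down’ oligo is the reverse complement of the guide portion of its ‘up’ partner once the four-nucleotide prefixes are removed. Treating TAGG and AAAC as cloning overhangs is an assumption inferred from the sequences themselves, and the helper names are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_annealing_pair(up, down, up_prefix="TAGG", down_prefix="AAAC"):
    """True if `down` (minus its prefix) is the reverse complement of the
    guide portion of `up` (minus its prefix). The prefixes are assumed to
    be the overhangs left for cloning."""
    if not (up.startswith(up_prefix) and down.startswith(down_prefix)):
        return False
    guide = up[len(up_prefix):]
    return reverse_complement(guide) == down[len(down_prefix):]


# Sequences copied from the text above.
oligo_pairs = {
    "mYTHDF1-E4-1": ("TAGGATAGTAACTGGACAGGTA", "AAACTACCTGTCCAGTTACTAT"),
    "mYTHDF1-E4-2": ("TAGGCACCATGGTCCACTGCAG", "AAACCTGCAGTGGACCATGGTG"),
}

results = {name: is_annealing_pair(up, down)
           for name, (up, down) in oligo_pairs.items()}
for name, ok in results.items():
    print(name, "complementary" if ok else "MISMATCH")
```

Both pairs pass this check, which is a cheap way to catch transcription errors in oligo lists before ordering.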

In vitro transcription and microinjection of CRISPR-Cas9 were performed as 
previously described. In brief, the Cas9 expression construct 
pST1374-Cas9-N-NLS-Flag-linker-D10A (Addgene 51130) was linearized with AgeI and transcribed using 
the mMACHINE T7 Ultra Kit (Ambion, AM1345). Cas9 mRNA was purified using 
RNeasy Mini Kit (Qiagen, 74104). pUC57-sgRNA expression vectors were linearized 
by DraI and transcribed using the MEGAshortscript Kit (Ambion, AM1354). 
sgRNAs were purified by MEGAclear Kit (Ambion, AM1908). A mixture of Cas9 mRNA 
(20 ng/µl) and two sgRNAs (5 ng/µl each) was injected into the cytoplasm and male 
pronucleus of zygotes obtained by mating of CBF1. Injected zygotes were transferred 
into pseudo-pregnant CD1 female mice. Founder mice used for experiments were 
backcrossed to C57BL/6 for at least five generations. Ythdf1-KO mice used for the 
experiments were killed at 8-16 weeks of age and did not show obvious developmental 
defects before this point. mYTHDF1-E4 C9 For: CACCTGAGTTCAGATCATTAC; 
mYTHDF1-E4 C9 Rev: GCTCCAGACTGTTCATCC. Amplicon length: 650 bp. 
Applicable to genotyping founders and targeted ESC. 

Genotyping. Mice were weaned at the third postnatal week and genotyped 
by PCR. Ythdf1-KO and wild-type alleles were detected by PCR assays in 
which primer F1 (5′-GTGTATGAGGTGGTCAGCAT-3′) and primer R1 
(5′-CTTGTTGAGGGAGTCACTGT-3′) amplified a 465-bp fragment (wild-type) 
and a 286-bp fragment (Ythdf1-KO) (Extended Data Fig. 1d). 
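
The two amplicon sizes make genotype calling from a gel a simple lookup. The sketch below is illustrative only: the tolerance for band-size estimation, the ‘heterozygous’ call for lanes showing both bands, and the function name are assumptions not stated in the text.

```python
WT_BAND_BP = 465   # wild-type amplicon size from the text
KO_BAND_BP = 286   # Ythdf1-KO amplicon size from the text


def call_genotype(band_sizes_bp, tolerance_bp=15):
    """Call a genotype from estimated band sizes (bp) in one gel lane.

    tolerance_bp is a hypothetical allowance for sizing error on a gel.
    """
    has_wt = any(abs(b - WT_BAND_BP) <= tolerance_bp for b in band_sizes_bp)
    has_ko = any(abs(b - KO_BAND_BP) <= tolerance_bp for b in band_sizes_bp)
    if has_wt and has_ko:
        return "heterozygous"
    if has_wt:
        return "wild-type"
    if has_ko:
        return "Ythdf1-KO"
    return "undetermined"


# The size difference between the two products, in base pairs.
size_difference_bp = WT_BAND_BP - KO_BAND_BP

for lane in ([465], [286], [460, 290], []):
    print(lane, "->", call_genotype(lane))
```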

Open-field test. Mice were exposed to a square open arena (40 cm × 40 cm) with 
opaque base and walls (40 cm high). Each mouse was allowed 30 min to explore 
the area and its activity was recorded and analysed using the Tru Scan Activity 
System (Coulbourn Instruments). The surface was cleaned with 70% ethanol after 
each mouse was tested. 

Elevated-plus maze. The elevated-plus maze apparatus consists of two open arms 
(50 cm × 9 cm), two enclosed arms (50 cm × 9 cm × 39 cm) and a central area 
(9 cm × 9 cm). The maze is elevated 70 cm above ground in a room with normal light. 
Mice were placed in the central area individually and allowed 5 min to explore the 
maze. The time each mouse spent in the open arms during the 5-min exploration 
was counted by Anymaze software. 

Light-dark box transition test. The light and the dark compartments of the 
light-dark transition box (35 cm × 35 cm × 40 cm) were separated by an opaque 
plexiglass board with a hole. The light compartment was illuminated by strong 
light (400 lx). During the test, mice were individually placed at the centre of the 
light compartment facing away from the hole and allowed 30 min to explore freely 
in the box. The activity of each mouse was monitored. The time mice spent in the 
light compartment as well as the number of transitions between the two 
compartments were automatically calculated by the Tru Scan Activity System (Coulbourn 
Instruments). 

Tail-suspension test. The tail-suspension test was used to assess behavioural 
despair of mice. Each mouse was suspended by its tail with adhesive tape for 6 
min and was video recorded. Total immobility time during the test was scored 
by independent observers. Mice were considered immobile only when they hung 
passively and motionlessly for at least 2 s. 


Morris water maze task. The MWM test was specifically designed to evaluate 
spatial reference memory abilities (Extended Data Fig. 3a). The Morris water 
tank consists of a circular pool (diameter 120 cm, height 50 cm) filled with water 
maintained at room temperature (23 ± 1 °C) and made opaque with nontoxic white 
paint. The pool is located in an experimental room with many extra-maze visual 
cues and virtually divided into four equal quadrants. A circular platform, 10 cm 
in diameter, is placed in the middle of one fixed quadrant (‘target’) of the pool, 
just above the water surface (visible platform) or 1 cm underneath the water surface 
(hidden platform). For visible platform training, mice were trained for four trials 
with 30 min inter-trial intervals each day for two consecutive days, and they were 
released from each starting point in a random order. For hidden platform training, 
mice were trained for four trials each day for five consecutive days. Twenty-four 
hours after the last trial of training (day 6), the platform was removed and all mice 
were given one probe trial of 60 s searching (probe test). The escape latency to the 
visible or hidden platform and the exploring time in each quadrant of the pool 
were automatically recorded by the water maze system (Coulbourn, Inc.). Mice were 
trained at the same time of day during their light phase. 

Contextual and auditory fear conditioning. The fear conditioning test was 
performed as previously described (Extended Data Fig. 3d). Mice were first handled 
for 5 min each day for three consecutive days and habituated to the conditioning 
chamber for 5 min the day before training. On the day of training, after 3 min 
exploration in the conditioning chamber, each mouse received one pairing of a tone 
(2,800 Hz, 75 dB, 30 s) with a short co-terminating foot shock (0.5 mA, 1 s) for the 
weak training protocol, a long foot shock (0.5 mA, 2 s) for the moderate protocol, 
or three pairings of a tone (2,800 Hz, 75 dB, 30 s) with a long co-terminating foot 
shock (0.5 mA, 2 s) for the strong protocol, after which they remained in the 
chamber for an additional 30 s and were then returned to home cages. Two hours and 24 
h after the conditioning, mice were tested for freezing (behavioural immobility) 
in response to the training context (training chamber) and to the tone (in the 
training chamber with a new environment and odour). The percentage of freezing 
time was calculated as an index of fear learning and memory. For contextual fear 
memory tests, mice were returned to the conditioning chamber for 3 min and 
freezing behaviour was counted using the StartFear Combined system (Panlab). For 
auditory fear memory tests, mice were placed in a changed chamber and freezing 
responses were recorded during the last 3 min when the tone was delivered. Tests 
of contextual and auditory fear memory were done in a counterbalanced manner. 
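
The freezing index described above is a simple proportion of the scoring window, summarized per group as mean ± s.e.m. A minimal sketch follows; the 180-s default reflects the 3-min test window in the text, while the per-mouse values and function names are hypothetical.

```python
import math
import statistics


def freezing_percent(freezing_s, window_s=180.0):
    """Percentage of the scoring window spent freezing."""
    if not 0 <= freezing_s <= window_s:
        raise ValueError("freezing time must lie within the scoring window")
    return 100.0 * freezing_s / window_s


def mean_sem(values):
    """Mean and standard error of the mean (sample s.d. / sqrt(n))."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical freezing times (s) for three mice; not data from the study.
freezing_times_s = [90.0, 72.0, 108.0]
percents = [freezing_percent(t) for t in freezing_times_s]
group_mean, group_sem = mean_sem(percents)
print(f"freezing: {group_mean:.1f} ± {group_sem:.1f} % (n = {len(percents)})")
```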
Electrophysiological recording of hippocampal slices. Extracellular field 
recordings and whole-cell miniature excitatory postsynaptic current (mEPSC) recordings 
in the hippocampal CA1 region were conducted in 380-µm-thick acute brain slices 
from 6-9-week-old wild-type control and Ythdf1-KO mice of either sex. Coronal 
sections that contained hippocampal formations were prepared as 
previously described. In brief, mice were anaesthetized with sodium pentobarbital 
and were killed by decapitation. Transverse slices of the hippocampus (380 µm) 
were cut using the vibratome at 4 °C in a modified artificial cerebrospinal fluid 
(mACSF) consisting of 110 mM choline chloride, 2.5 mM KCl, 0.5 mM CaCl2, 
7 mM MgSO4, 25 mM NaHCO3, 1.25 mM NaH2PO4, 25 mM D-glucose, and 3.1 
mM sodium pyruvate, which was saturated with 95% O2 and 5% CO2. Slices were 
transferred to an incubating chamber with oxygenated (95% O2 and 5% CO2) 
normal ACSF containing 120 mM NaCl, 2.5 mM KCl, 2.5 mM CaCl2, 1.3 mM MgSO4, 
26 mM NaHCO3, 1 mM NaH2PO4, 10 mM D-glucose (pH 7.3-7.4) and incubated 
at 30 °C for at least 2 h before recording. Data were collected with a MultiClamp 
700B (Molecular Devices), digitized using Digidata 1440A and pClamp 10.1 data 
acquisition system (Molecular Devices). Frequency, duration, and magnitude 
of extracellular stimulus were controlled with a Master 8 pulse stimulator (A-M 
Systems). Evoked synaptic responses were triggered with a bipolar electrode. 
LTP and PPF. To record the extracellular field excitatory postsynaptic potentials 
(fEPSPs), a glass micro-electrode (4-8 MΩ, filled with 0.5 M sodium acetate) 
was placed in the stratum radiatum of the CA1 region, and a bipolar tungsten 
stimulating electrode was placed along the Schaffer collateral fibres 100-150 µm 
away from the recording pipette. The intensity of the stimulation was adjusted to 
produce an fEPSP with an amplitude of 30-40% of the maximum response. Test 
stimulation was delivered once per 30 s (0.033 Hz) or per minute (0.017 Hz). After 
recording a stable baseline for at least 30 min, early-phase LTP or later-phase LTP 
was induced by two (100 Hz for 1 s, 30 s interval) or four trains (100 Hz for 1 s, 5 
min interval) of HFS, respectively. Magnitudes of LTP and later-phase LTP were 
calculated based on the averaged fEPSP values during the last 10 min and 30 min 
of summary plots, respectively. 
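
The magnitude calculation described above (averaging fEPSP values over the final window of the recording) can be sketched as follows. Expressing the result as a percentage of the pre-induction baseline mean is an assumption of this sketch, as are the function name, the one-sample-per-minute time base and the toy amplitudes.

```python
def ltp_magnitude(times_min, fepsp, induction_min=0.0, window_min=10.0):
    """Mean fEPSP over the last `window_min` minutes, as % of baseline.

    times_min: sample times in minutes (baseline samples precede induction_min)
    fepsp:     fEPSP amplitudes (or slopes) at those times
    """
    if len(times_min) != len(fepsp):
        raise ValueError("times and responses must have the same length")
    baseline = [v for t, v in zip(times_min, fepsp) if t < induction_min]
    last_time = max(times_min)
    late = [v for t, v in zip(times_min, fepsp) if t > last_time - window_min]
    if not baseline or not late:
        raise ValueError("need both baseline and post-induction samples")
    baseline_mean = sum(baseline) / len(baseline)
    late_mean = sum(late) / len(late)
    return 100.0 * late_mean / baseline_mean


# Toy recording: 30 min baseline at 1.0 mV, then 60 min at 1.5 mV after HFS.
times = list(range(-30, 61))                     # one sample per minute
responses = [1.0 if t < 0 else 1.5 for t in times]

early_ltp = ltp_magnitude(times, responses, window_min=10.0)
late_ltp = ltp_magnitude(times, responses, window_min=30.0)
print(f"last 10 min: {early_ltp:.0f}% of baseline; last 30 min: {late_ltp:.0f}%")
```

Passing `window_min=10.0` or `window_min=30.0` corresponds to the two averaging windows named in the text.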

For paired-pulse facilitation (PPF) recording, a second stimulus was delivered 
following the first one with different time intervals. The two stimuli were sepa- 
rated by 20, 40, 60, 80, 100, 200, 400, 600, 800, or 1,000 ms. The amplitude of the 
population response to the second stimulus was compared with that to the first 
one to obtain the PPF ratio. 

Miniature EPSCs. Voltage clamp recordings were obtained from neurons 
in hippocampus slices equilibrated for at least 1 h in the recording chamber. 
Micropipettes (tip diameter: 1.5-2.0 µm; resistance: 4-6 MΩ) were pulled from 
borosilicate capillaries (P-97; Sutter Instruments) and filled with an internal 
solution (adjusted to pH 7.2) composed of 145 mM potassium gluconate, 5 mM NaCl, 
1 mM MgCl2, 0.2 mM EGTA, 10 mM HEPES, 2 mM magnesium ATP, 0.1 mM 
sodium-guanosine-5′-triphosphate, and 10 mM phosphocreatine disodium. For 
mEPSC recordings, 0.5 µM tetrodotoxin was added to the perfusion solution. 
Picrotoxin (100 µM) was present in all experiments to block γ-aminobutyric 
acid (GABA) type A receptor-mediated inhibitory synaptic currents. Neuronal 
signals, which were amplified using an Axoclamp-700B amplifier (bandwidth filter 
set at 1 kHz for voltage clamp recordings), were digitized (Digidata 1440A, pClamp 
10.1; Molecular Devices). The series resistance was 12-20 MΩ and was monitored 
throughout the experiment. Data were discarded when access resistance changed 
by more than 15% during the experiment. 

Plasmid constructs and viruses. For the reporter assay, 
pPB-CAG-Flag-YTHDF1-N-λ and pPB-CAG-Flag-λ were constructed by inserting the 
Flag-YTHDF1-N-λ (YTHDF1-N: N terminus of mouse YTHDF1, 1-389 
aa) and Flag-λ fragments into the pPB-CAG backbone vector between BglII and 
XhoI restriction sites, respectively. For AAV vectors, 
pAAV-CMV-mouse-YTHDF1-2a-mCherry-WPRE, pAAV-CMV-mCherry-WPRE, 
pAAV-CMV-RFP-U6-YTHDF1-shRNA, pAAV-CAG-eGFP-H1-YTHDF1-shRNA, and 
pAAV-CAG-eGFP-H1-METTL3-shRNA were all designed and constructed by 
standard methods. The following oligonucleotide sequences were used for 
knockdown: mouse YTHDF1-shRNA: 5′-GATCCTTACCTGTCCAGTTAC-3′; mouse 
METTL3-shRNA: 5′-GCACACTGATGAATCTTTAGG-3′; Scramble control: 
5′-AACAGTCGCGTTTGCGACTGG-3′. AAV viruses were prepared by Taitool 
Biotech (Shanghai). 

In vivo stereotactic injections. For viral injection, male mice (8-10 weeks of age) 
were anaesthetized with 5% chloral hydrate (100 µl/10 g body weight) by 
intraperitoneal (i.p.) injection and placed on a stereotaxic apparatus. Small bilateral holes 
were drilled into the skull at -1.7 mm posterior and -1.5 mm lateral to bregma for 
injections into the hippocampal CA1 and dentate gyrus (DG) regions. A glass 
cannula filled with a virus solution was lowered to CA1 (-1.5 mm) and DG (-2.0 mm), 
and the virus solution (0.6 µl) was injected using a Nanoject II (Drummond) 
system at a rate of 0.1 µl per min sequentially into each side of the hippocampus. The 
injection cannula was slowly withdrawn 5 min after the virus infusion. The scalp 
was then sealed and injected mice were monitored as they recovered from 
anaesthesia. Behavioural experiments or electrophysiological recordings were performed 
at least 10 days after virus injection. Virus infection was examined at the end of 
all the behavioural tests. 

Immunohistochemistry. Ythdf1-KO and wild-type male mice (from P28 to 16 
weeks old) were perfused with phosphate-buffered saline (PBS) followed by 4% 
paraformaldehyde in PBS. After post-fixation in 4% PFA for 12 h at 4 °C and 
dehydration in 30% sucrose-PBS solution for another 24 h, the brains were 
frozen-sectioned into coronal slices (35 µm) for subsequent use. For anti-YTHDF1 
and anti-DCX staining, slices were incubated in diluted antibody solution at 4 °C 
overnight and then detected with Alexa Fluor-conjugated secondary antibodies. All slices 
were counterstained with Hoechst in the final incubation step. Fluorescent image 
acquisition was performed using an Axioimager Z2 microscope or LSM 510 
confocal microscope (Zeiss). Images were analysed with Image-Pro Plus and ImageJ 
software. Brain slices from mice injected with 
AAV-CMV-mouse-YTHDF1-2a-mCherry-WPRE (AAV-YTHDF1) and AAV-CMV-mCherry-WPRE 
(AAV-control) were used for the YTHDF1 overexpression quantification assay (Extended 
Data Fig. 5a, b). For Extended Data Fig. 2d, female mice (P60) were anaesthetized 
and perfused with ice-cold 4% paraformaldehyde (PFA). Brains were removed 
from perfused animals, post-fixed overnight in 4% PFA in PBS, and cryoprotected 
in 30% sucrose (w/v) for 2-3 days. Samples were sectioned on a microtome at 
40 µm thickness. Primary antibody was applied at 4 °C overnight. Secondary 
antibody was applied for 2 h at room temperature. 

Western blot. Samples were homogenized in RIPA buffer (Beyotime) containing 
1 mM PMSF, 1× protease inhibitor cocktail and 1× phosphatase inhibitor 
cocktail (Sigma). Lysates were boiled at 100 °C with 6× loading buffer (Beyotime) for 
8 min and then stored at -80 °C for subsequent use. A total of 30 µg protein per 
sample was resolved on SDS-PAGE (10%) at 80 V for 20 min and then 110 V for 
110 min. Proteins on the gel were transferred onto PVDF membranes (Millipore) 
and blocked in 5% milk blocking solution for 1 h at room temperature, incubated 
in a diluted primary antibody solution at 4 °C overnight, and incubated in a dilution 
of secondary antibody conjugated to HRP for 2 h at room temperature (dilution 
folds indicated in the section on antibodies). Protein bands were detected using 
ECL western blotting detection reagents (Millipore) and an Amersham Imager 600 
system (GE). 

Dissociated neuron culture and reporter assay. Hippocampal neurons from E18 
C57BL/6 mouse embryos of either sex were cultured at a density of 200,000 cells per 
well on poly-D-lysine pre-coated 6-well plates. Neuron cultures were maintained in 
complete medium (neurobasal medium supplemented with 0.5 mM GlutaMAX-I 
and 2% B-27). Plasmid transfection was conducted using the 4D-Nucleofector System 
(Lonza) immediately after neuron dissociation. 

The reporter plasmid (pmirGlo-5BoxB) and the effector plasmid (Flag-λ or 
Flag-YTHDF1-N-λ in pPB-CAG vector) were used to transfect neuron cultures 
at a ratio of 1:9 as previously reported. After transfection, neurons were plated 
in plating medium (neurobasal medium supplemented with 0.5 mM GlutaMAX-I, 2% 
B-27 and 5% FBS) for 6 h, then changed to complete medium for further culturing. 
Three days after transfection, neurons were treated with KCl at a final 
concentration of 50 mM for 2, 4, or 8 h. Neurons were then collected and assayed using 
Dual-Glo Luciferase Assay Systems (Promega) to test protein production. 
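
A dual-luciferase readout of this kind is typically reduced to a ratio of reporter to normalizer signal, then expressed relative to a control group. The sketch below illustrates that arithmetic under stated assumptions: the luminescence counts are invented, the function names are hypothetical, and scaling to the mean of the tether-only control is an assumed convention rather than the procedure confirmed for this study.

```python
def reporter_ratio(fluc, rluc):
    """Firefly signal divided by the Renilla signal from the same well."""
    if rluc <= 0:
        raise ValueError("Renilla signal must be positive")
    return fluc / rluc


def relative_to_control(sample_ratios, control_ratios):
    """Scale ratios so that the mean of the control group equals 1."""
    control_mean = sum(control_ratios) / len(control_ratios)
    return [ratio / control_mean for ratio in sample_ratios]


# Hypothetical (F-Luc, R-Luc) luminescence counts per well.
control_wells = [(1000.0, 500.0), (1200.0, 600.0)]    # tether-only control
tethered_wells = [(1500.0, 500.0), (1800.0, 600.0)]   # YTHDF1-N tethered

control_ratios = [reporter_ratio(f, r) for f, r in control_wells]
tethered_ratios = [reporter_ratio(f, r) for f, r in tethered_wells]
normalized = relative_to_control(tethered_ratios, control_ratios)
print("normalized F-Luc expression:", normalized)
```

Dividing by the Renilla signal corrects for well-to-well differences in transfection efficiency and cell number before groups are compared.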
Quantitative mass spectrometry. Hippocampal samples from wild-type control 
and Ythdf1-KO mice (8-12 weeks, male) were freshly isolated and snap frozen in 
liquid nitrogen. For each mouse, 100 mg tissue was used for further preparation. 
Tissues were ground into fine powder in liquid nitrogen then lysed with 500 µl 
freshly prepared lysis buffer (20 mM triethylammonium bicarbonate (TEAB, pH 
8.5), 8 M urea, protease inhibitor cocktail (Roche), and 1 mM DTT). The resulting 
lysate was treated with ultrasonication at 4 °C for 30 s to shear DNA, followed by 
centrifugation at 16,000g for 10 min at 4 °C. The resultant supernatant was carefully 
separated and transferred into a new tube. Protein concentrations were measured 
using BCA Protein Assay Kit (Thermo Scientific). For each condition, a total of 100 
µg protein was reduced with 5 mM tris(2-carboxyethyl)phosphine (TCEP) for 3 h 
at 30 °C, then alkylated with 10 mM methyl methanethiosulfonate (MMTS) for 45 
min at room temperature (protected from light). Samples were then diluted with 
20 mM TEAB to obtain a final concentration of 1 M urea before digestion with 
2.5 µg trypsin overnight at 37 °C. Resultant tryptic peptides were finally labelled 
with TMT 10plex Mass Tag Labelling Kit (Thermo Scientific) according to the 
manufacturer’s protocol, followed by liquid chromatography with tandem mass 
spectrometry (LC-MS/MS) analysis. 

PSD preparation. Hippocampal tissues from wild-type and Ythdf1-KO mice 
(8-12 weeks, male) were isolated, snap frozen in liquid nitrogen, and stored at -80 °C 
before use. The PSD fraction was prepared as previously described; 
hippocampal tissues were homogenized in homogenization buffer (320 mM sucrose, 
5 mM sodium pyrophosphate, 1 mM EDTA, 10 mM HEPES (pH 7.4), 1× protease 
inhibitor cocktail, and 1× phosphatase inhibitor cocktail (Sigma)). The 
resultant homogenate was centrifuged at 800g for 10 min at 4 °C to yield post-nuclear 
pelleted fraction 1 (P1) and supernatant fraction 1 (S1). S1 was further centrifuged 
at 15,000g for 20 min at 4 °C. Then pellet P2 (which contains the synaptosome) was 
resuspended in 4 mM HEPES (pH 7.4) and incubated with agitation at 4 °C for 
30 min. Suspended P2 was centrifuged at 25,000g for 20 min at 4 °C. The resulting 
pellet was resuspended in 50 mM HEPES (pH 7.4), mixed with an equal volume of 
1% Triton X-100, and incubated with agitation at 4 °C for 15 min. The PSD fraction 
was generated by centrifugation at 32,000g for 20 min at 4 °C. The final PSD pellet 
was resuspended in 50 mM HEPES followed by protein quantification and then 
boiled with 6× loading buffer for western blot. 

Lucifer yellow labelling by intracellular injection. Wild-type control and 
Ythdf1-KO mice (8-12 weeks, male) were perfused with 4% PFA in PBS and their 
brains were removed to perform intracellular injection of the fluorescent dye 
Lucifer yellow (LY). The brains were post-fixed for 24 h in 4% PFA in PBS, and 
coronal sections were obtained (200 µm). Intracellular injections were performed 
as previously described. In brief, sections were placed under a differential 
interference contrast (DIC) microscope to find healthy CA1 neurons and a continuous 
current (5-10 nA) was used to inject cells with LY. At least five CA1 pyramidal cells 
per mouse were injected individually with LY, with the current applied until the 
distal tips of each dendrite fluoresced brightly (5-10 min). Images (z-stacks) for 
spine density counting were acquired using an LSM 510 confocal microscope with 
a 63× oil objective. Spine counting and spine morphology analyses were performed 
using Neurostudio software. 

Protein synthesis assay. Wild-type control and Ythdf1-KO mouse hippocampal 
neurons were cultured on pre-coated glass cover slides. Twelve days later, a protein 
synthesis assay was conducted using Click-iT Plus OPP Alexa Fluor 488 Protein 
Synthesis Assay Kit (Invitrogen, C10456) following the manufacturer's protocol. 
In brief, the neurons were treated with 50 mM KCl for 10 min before the complete
culture medium was changed back. Click-iT OPP (Component A) was diluted
1:1,000 in pre-warmed culture medium as a 20 μM working solution. Two or four
hours after the KCl treatment, the culture medium was replaced with the working
solution for another 30-min incubation under culturing conditions. The medium 
was then removed, and the neurons were washed once with PBS before being 
fixed with 4% PFA for 15 min and permeabilized with 0.5% Triton X-100 (in PBS) 
for another 15 min at room temperature. After two more rounds of wash with 
PBS, the neurons were incubated with a freshly prepared Click reaction cocktail 
for 30 min at room temperature in the dark and rinsed once with the reaction 
rinse buffer. Finally, neurons were counterstained with NuclearMask Blue Staining 
working solution and washed twice with PBS before imaging and analysis. For 
assays using AAV-mediated knockdown (Extended Data Fig. 9d, e), wild-type
mouse hippocampal neurons were cultured on pre-coated glass cover slides and,
three days later, transfected with AAV-YTHDF1-RNAi-GFP (pAAV-CAG-eGFP-H1-
YTHDF1-shRNA) or AAV-control-GFP. Four to five days after virus transfection, a 
protein synthesis assay was conducted as described above using Click-iT Plus OPP 
Alexa Fluor 594 Protein Synthesis Assay Kit (Invitrogen, C10457). All images
were acquired using identical settings and analysed using Image-Pro Plus. The
integrated optical density (IOD) of the fluorescent signal was divided by the area
of the Hoechst signal for each image to derive the signal intensity per neuron. For
comparison, fluorescence intensities from experimental groups were normalized
to those from control neurons under sham conditions.
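
The quantification described above can be sketched as follows. This is a minimal illustration, not the Image-Pro Plus workflow itself: the function names and the example numbers are hypothetical, and it assumes that "normalized to control" means division by the mean signal of the sham control group, which the text does not state explicitly.

```python
from statistics import mean


def signal_per_image(iod, hoechst_area):
    """Nascent-protein signal for one image: IOD divided by Hoechst area."""
    if hoechst_area <= 0:
        raise ValueError("Hoechst area must be positive")
    return iod / hoechst_area


def normalize_to_reference(signals_by_group, reference="control_sham"):
    """Divide each image's signal by the mean signal of the reference group.

    Assumption: the reference is the mean of the sham control images.
    """
    ref_mean = mean(signals_by_group[reference])
    return {
        group: [s / ref_mean for s in values]
        for group, values in signals_by_group.items()
    }


# Hypothetical (IOD, Hoechst area) pairs; these are not data from the study.
images = {
    "control_sham": [(1000.0, 50.0), (1200.0, 60.0)],
    "control_KCl_2h": [(1800.0, 60.0), (1500.0, 50.0)],
}
signals = {
    group: [signal_per_image(iod, area) for iod, area in pairs]
    for group, pairs in images.items()
}
normalized = normalize_to_reference(signals)
```

With this convention the sham control group has a mean of 1, so values for the other groups read directly as fold changes relative to unstimulated control neurons.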

YTHDF1-CLIP-seq. Fifteen micrograms of rabbit anti-YTHDF1 antibody 
(Proteintech, 17479-1-AP), 50 μl Protein A beads (ThermoFisher), 50 μl Protein
G beads (ThermoFisher), and four pairs of hippocampi from C57BL/6 mice (9 
weeks, male) were used for each biological replicate. Three biological replicates 
were performed. 

For each replicate, Protein A and G beads were washed three times with PBST 
(PBS with 0.05% Tween-20) and resuspended with anti-YTHDF1 antibody in 250
μl PBST before overnight rotation at 4°C. On the following day, mouse hippocampal
tissues were dissected, homogenized in 500 μl HBSS buffer, and crosslinked
in 6-well plates on ice four times with 254 nm UV light, 0.15 J/cm² each time (UV
stratalinker 2400, Stratagene). Lysis buffer (1.5 ml; 150 mM NaCl, 0.5% NP-40, 
50 mM Tris-HCl (pH 7.5), 2 mM EDTA, 1% protease inhibitor cocktail (Roche), 
0.5 mM DTT) was added to the crosslinked tissue pellet for a 40-min rotation at 
4°C. After being cleared by centrifugation at maximum speed, the lysate underwent a
first round of RNA digestion with 0.2 U/μl RNase T1 (ThermoFisher, EN0541)
for 15 min at room temperature followed by a 5-min quenching on ice. A 100-μl
aliquot of the resultant lysate was saved as ‘Input’, and the remaining lysate was
incubated with the Protein A and G beads conjugated with anti-YTHDF1 antibody.
After three hours of rotation at 4°C, the beads were washed three times with 1 ml 
immunoprecipitation (IP) wash buffer (50 mM Tris-HCl (pH 7.5), 300 mM KCl, 
0.05% NP-40, 0.5 mM DTT, 1% protease inhibitor cocktail), and resuspended in 
200 μl IP wash buffer supplemented with 10 U/μl RNase T1 for a second round
of RNA digestion for 8 min at room temperature. The previously saved ‘Input’
was digested in parallel, and immediately supplemented with 4× Laemmli sample
buffer (Bio-Rad). After a 5-min quenching on ice, the beads were washed
three times with high-salt wash buffer (50 mM Tris-HCl (pH 7.5), 500 mM NaCl, 
0.05% NP-40, 0.5 mM DTT, 1% protease inhibitor cocktail, 1% SUPERase In) and 
another three washes with PNK buffer (50 mM Tris-HCl (pH 7.5), 50 mM NaCl, 
10 mM MgCl₂). The RNA fragments co-immunoprecipitated with Protein A and
G beads (‘CLIP’) were subjected to end-repair by: (1) 1 U/μl T4 PNK (ThermoFisher,
EK0031) in 100 μl 1× PNK buffer A (ThermoFisher) at 37°C for 20 min with vigorous
shaking; and then (2) 1 mM ATP (final concentration) with another 0.5 U/μl
T4 PNK at 37°C for another 20 min. The beads were washed with PNK buffer
another five times and then resuspended in 100 μl 2× Laemmli sample buffer.
The YTHDF1-RNA complex was size-selected by SDS-PAGE (size indicated in 
Extended Data Fig. 8b), and the gel slice at the same molecular weight was cut for 
‘Input’ samples in parallel. To extract RNA, the gel slices were mashed and digested
with 2 mg/ml proteinase K (ThermoFisher, RNA-grade, 25530049) at 55°C for 1
h. The gel particles were then filtered out, and the RNA was purified by
acid-phenol:chloroform extraction and overnight ethanol precipitation. The Input RNA
fragments were end-repaired using T4 PNK and further cleaned up using RNA 
Clean & Concentrator-5 (Zymo Research). RNA libraries were generated using 
NEBNext multiplex small RNA library preparation kit (NEB, E7300S) for both 
Input and CLIP samples. 

m⁶A-CLIP-seq. Total RNA was extracted from hippocampal tissue dissected
from C57BL/6 mice (8-16 weeks, male) using Trizol (Invitrogen) and isopropanol
precipitation. Poly(A)+ RNA was purified using Dynabeads mRNA DIRECT
Purification Kit (Invitrogen) following the manufacturer’s instructions.

For the m⁶A-CLIP-seq, we followed the previously reported protocol with a smaller
amount of starting material: 300 ng purified poly(A)+ RNA, with 2.5 μg anti-m⁶A
antibody (Synaptic Systems, 202003) and 25 μl Protein A/G beads. Three biological
replicates were performed, and the pair of hippocampi from one mouse was
pooled for each replicate.

RNA-seq. Total RNA from wild-type littermate control and Ythdf1-KO mouse
(8-16 weeks, male) hippocampus was extracted using Trizol (Invitrogen) and 
isopropanol precipitation. mRNA extraction was performed by poly(A)+ RNA 
selection once using Dynabeads mRNA DIRECT Purification Kit (Invitrogen). The 
RNA libraries were prepared using Truseq stranded mRNA sample preparation 
kit (Illumina) according to the manufacturer's protocol. Three biological replicates 
were performed for each genotype, and two hippocampi from one mouse were 
pooled for each replicate. 

ECT and m⁶A-RIP-seq of dentate gyrus. Adult male C57BL/6 mice (6-8 weeks old;
Charles River) were used and housed in a standard facility. ECT was achieved with
pulses consisting of 1.0 s, 100 Hz, 16-18 mA stimulus of 0.3 ms delivered using the
Ugo Basile ECT unit (Model 57800) as previously described. Mock-treated mice
were handled in parallel without the electrical current delivery.

m⁶A-RIP-seq. Total RNA from adult mouse dentate gyrus was isolated using TRIzol
reagent according to the manufacturer's instructions (Invitrogen). mRNA purification
was performed with poly(A)+ RNA selection twice using Dynabeads Oligo
(dT)25 (Thermo Fisher; 61006). A total of 150 ng of mRNA was subjected to
m⁶A-SMART-seq using anti-m⁶A rabbit polyclonal antibody (Synaptic Systems,
202003) as previously described. In brief, 5 μg of anti-m⁶A polyclonal antibody
was conjugated to Dynabeads Protein A (Thermo Fisher; 10001D) and used
for each affinity pull-down. The m⁶A RNA was eluted twice with 6.7 mM
N⁶-methyladenosine (Sigma-Aldrich; M2780) in 1× IP buffer (10 mM Tris-HCl (pH
7.5), 150 mM NaCl, and 0.1% (vol/vol) Igepal CA-630) and recovered by RNA
Clean and Concentrator-5 (Zymo Research). Libraries were generated using the
SMART-seq protocol as described. Three biological replicates for each condition
were sequenced using Illumina NextSeq 550 from a single end for 75 bases.

Data analysis of high-throughput sequencing data. General processing (for all
sequencing samples unless specified). Sequencing was carried out on NextSeq 500
with single-end 80-bp read length or NextSeq 550 with single-end 75-bp read length,
according to the manufacturer’s instructions. Sequencing data were mapped to mouse
genome version mm10 downloaded from UCSC using TopHat v2.0.14. For
RNA-seq analysis, rpkm values were calculated by Cuffnorm. For CLIP-seq
experiments, after removing the adaptor using Cutadapt, the reads were aligned to
the mouse genome (mm10) by Bowtie 2. More information can be found in
Supplementary Table 1.
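
For orientation, the rpkm unit used throughout the analysis can be written out explicitly. The sketch below gives the textbook definition (reads per kilobase of transcript per million mapped reads). It is not a reimplementation of Cuffnorm, which applies its own library normalization, and the function name and example numbers are illustrative assumptions.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    rpkm = reads * 1e9 / (transcript length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2-kb transcript in a library of
# 20 million mapped reads.
example = rpkm(500, 2000, 20_000_000)
```

Dividing by transcript length and library size makes values comparable between genes of different lengths and between samples sequenced to different depths, which is what the expression cut-offs later in the analysis rely on.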

Peak calling in YTHDF1-CLIP-seq. All mapped reads were treated as background
and mutations were treated as signals for peak calling. PARalyzer was used for
peak calling in CLIP-seq samples with a few modifications: (1) mutations found in both
CLIP and Input were removed from CLIP; (2) sites with a 100% mutation rate were
also removed. The remaining mutations were used for peak calling. At least two
mutation sites are needed in each peak
(MINIMUM_CONVERSION_LOCATIONS_FOR_CLUSTER=2).
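
The two filtering modifications and the minimum-site requirement can be sketched as below. This is only an illustration of the filtering logic under assumed data structures (a dictionary of CLIP mutation sites and a set of Input mutation sites). It does not reproduce PARalyzer's own clustering, and all names other than the quoted parameter are hypothetical.

```python
# Mirrors MINIMUM_CONVERSION_LOCATIONS_FOR_CLUSTER=2 from the text.
MIN_CONVERSION_LOCATIONS = 2


def filter_mutation_sites(clip_sites, input_sites):
    """Keep CLIP mutation sites that pass both filters.

    clip_sites:  {(chrom, pos): (mutated_reads, total_reads)}
    input_sites: set of (chrom, pos) mutated in the Input library
    """
    kept = {}
    for site, (mutated, total) in clip_sites.items():
        if site in input_sites:
            continue  # (1) mutation also present in Input
        if mutated >= total:
            continue  # (2) 100% mutation rate
        kept[site] = (mutated, total)
    return kept


def cluster_passes(cluster_sites, kept_sites):
    """True if a candidate cluster contains enough retained mutation sites."""
    n_sites = sum(1 for site in cluster_sites if site in kept_sites)
    return n_sites >= MIN_CONVERSION_LOCATIONS


# Hypothetical sites; not data from the study.
clip = {
    ("chr1", 100): (3, 10),
    ("chr1", 105): (10, 10),  # 100% mutation rate, removed
    ("chr1", 110): (2, 8),
    ("chr1", 120): (4, 9),    # also mutated in Input, removed
}
input_mutations = {("chr1", 120)}
kept = filter_mutation_sites(clip, input_mutations)
```

Sites mutated in every read are removed in step (2); a likely reason, not stated in the text, is that such sites are more consistent with genomic variants than with crosslink-induced mutations.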

Peak calling in m⁶A-CLIP-seq. We followed the same peak calling method as
previously reported. The consensus motif was determined using HOMER for the
m⁶A-CLIP peaks identified in each replicate.

m⁶A-RIP-seq analysis. Low-quality bases and adaptor sequences from original reads
were removed using Trimmomatic. The remaining reads were then mapped to
the mouse genome (mm10) using STAR aligner. Mapped reads between samples
were normalized using DESeq2. The Input and RIP libraries were normalized.
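
The text does not say which DESeq2 options were used. As a rough guide to what between-sample normalization involves, the sketch below implements the median-of-ratios size-factor estimate that DESeq2 applies by default, written in plain Python for illustration. The function names and the count matrix are hypothetical, and this is not the DESeq2 code itself.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list with one raw count per sample.
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if any(c <= 0 for c in gene):
            continue
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in every sample")
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts, factors):
    """Divide each sample's counts by that sample's size factor."""
    return [[c / f for c, f in zip(gene, factors)] for gene in counts]


# Hypothetical counts for four genes in two samples; sample 2 is
# sequenced about twice as deeply as sample 1.
raw = [[10, 20], [100, 200], [5, 10], [0, 7]]
factors = size_factors(raw)
scaled = normalize(raw, factors)
```

After scaling, genes whose counts differ between samples only because of sequencing depth end up with equal normalized counts.
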

Integrative analysis. (1) Definitions for groups of transcripts: (i) The common peaks
(peaks from replicate 1 with >1 nt overlap in peak location with those from both
replicates 2 and 3) of CLIP-seq are defined as high-confidence CLIP peaks. (ii)
Transcripts with high-confidence YTHDF1-CLIP peaks are defined as YTHDF1-CLIP
targets (Fig. 4c, d, i; Extended Data Fig. 8i, j). (iii) Transcripts without
YTHDF1-CLIP peaks in any of the three YTHDF1-CLIP-seq replicates are defined
as non-YTHDF1-CLIP targets, and they were used as a control group for analysing
gene expression change in the absence of YTHDF1 (Fig. 4d, i; Extended Data
Fig. 8i, j). (iv) Transcripts with high-confidence m⁶A-CLIP peaks are defined as
m⁶A-modified transcripts (Fig. 4d, i; Extended Data Fig. 8f, h, i, j). (v) Transcripts
with overlapping high-confidence YTHDF1-CLIP peaks and high-confidence m⁶A-CLIP
peaks (>1 nt in peak location) are defined as YTHDF1-CLIP + m⁶A-CLIP
transcripts (Fig. 4i; Extended Data Fig. 8i, j). (2) Functional annotation of a list
of genes was generated by DAVID, for YTHDF1-CLIP targets (Fig. 4c) and
m⁶A-modified transcripts with no fewer than five mutations in m⁶A-CLIP-seq
(Extended Data Fig. 8f). (3) Only genes with sufficient expression (rpkm >1 in
RNA-seq of wild-type triplicates; rpkm >1 in m⁶A-RIP-seq input or RIP libraries)
were kept and subjected to further analyses. The median rpkm value of the sequencing
triplicates was used for differential analyses.
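
Definitions (1)(i) and (3) translate into simple interval and threshold operations, sketched below. Several choices here are assumptions rather than statements from the text: peaks are represented as half-open (chrom, start, end) tuples, the overlap threshold is taken literally as more than 1 nt, and a gene counts as expressed only if every replicate exceeds the cut-off. All function names are hypothetical.

```python
from statistics import median


def overlap_nt(a, b):
    """Number of shared nucleotides between two (chrom, start, end) peaks."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def high_confidence_peaks(rep1, rep2, rep3, more_than_nt=1):
    """Replicate-1 peaks overlapping peaks in both other replicates."""
    def has_hit(peak, others):
        return any(overlap_nt(peak, o) > more_than_nt for o in others)

    return [p for p in rep1 if has_hit(p, rep2) and has_hit(p, rep3)]


def expressed_genes(rpkm_by_gene, cutoff=1.0):
    """Median rpkm of the triplicates for genes passing the cut-off.

    Assumption: all replicates must exceed the cut-off.
    """
    return {
        gene: median(values)
        for gene, values in rpkm_by_gene.items()
        if all(v > cutoff for v in values)
    }


# Hypothetical peaks and expression values; not data from the study.
rep1 = [("chr1", 100, 150), ("chr1", 300, 340), ("chr2", 10, 40)]
rep2 = [("chr1", 120, 160), ("chr1", 339, 400), ("chr2", 20, 30)]
rep3 = [("chr1", 90, 110), ("chr1", 300, 320), ("chr3", 10, 40)]
confident = high_confidence_peaks(rep1, rep2, rep3)
expressed = expressed_genes({"A": [2.0, 3.0, 4.0], "B": [0.5, 2.0, 3.0]})
```

Anchoring on replicate 1 keeps the peak coordinates of a single replicate, so the result depends slightly on which replicate is chosen as the reference. A pairwise scan like this is adequate for a sketch; genome-scale peak sets would normally use an interval tree or a sorted sweep.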

Antibodies. The antibodies used in this study are listed below in the format
of name (application, catalogue number, supplier, dilution factor): rabbit anti-YTHDF1
(western blot, 17479-1-AP, Proteintech, 500-1,000; IF, 200); rabbit anti-YTHDF2
(western blot, 24744-1-AP, Proteintech, 500); rabbit anti-YTHDF3 (western blot,
25537-1-AP, Proteintech, 500); rabbit anti-YTHDC1 (western blot, 14392-1-AP,
Proteintech, 500); rabbit anti-YTHDC2 (western blot, ab176846, Abcam, 1,000);
rabbit anti-METTL3 (western blot, ab195352, Abcam, 1,000); mouse anti-GAPDH
(western blot, G8795, Sigma, 3,000); rabbit anti-DCX (IF, ab18723, Abcam, 1,000);
mouse anti-ACTIN (western blot, 44700, Sigma, 1,000); rabbit anti-GRIA1 (western
blot, AB1504, Merck, 1,000); rabbit anti-CAMK2 (western blot, 4436S, Cell
Signaling, 1,000); rabbit anti-GRIN1 (western blot, 5704S, Cell Signaling, 1,000);
rabbit anti-GRIN2A (western blot, 4205S, Cell Signaling, 1,000); mouse anti-BSN
(western blot, ab82958, Abcam, 500); goat anti-mouse IgG HRP-conjugated
(western blot, AP308P, Merck, 5,000); goat anti-rabbit IgG HRP-conjugated
(western blot, AP307P, Merck, 5,000); rat anti-CTIP2 (IF, ab18465, Abcam,
300); mouse anti-SATB2 (IF, ab51502, Abcam, 300); Alexa Fluor 488 goat anti-mouse
IgG (IF, A11029, ThermoFisher, 1,000); Cy3 AffiniPure donkey anti-rat
IgG (H+L) (IF, 712-165-153, Jackson ImmunoResearch, 500); Cy5 AffiniPure
donkey anti-mouse IgG (H+L) (IF, 715-175-150, Jackson ImmunoResearch,
500); biotin-SP-conjugated goat anti-rabbit IgG (IF, 111-065-003, Jackson
ImmunoResearch, 500); Cy2-conjugated streptavidin (IF, 016-220-084, Jackson
ImmunoResearch, 1,000).

Reporting summary. Further information on experimental design is available in 
the Nature Research Reporting Summary linked to this paper. 


Data availability 

High-throughput sequencing data can be accessed in the Gene Expression 
Omnibus under accession number GSE106607. Source data for bar graphs and 
box-plots in Figures and Extended Data Figures are provided in separate Excel files.

Extended Data Fig. 1 | Generation and evaluation of Ythdf1-KO mice.
a, Schematic diagram of the targeting strategy for generating Ythdf1-KO
mice using CRISPR-Cas9. Two sgRNAs (red) were designed to target
the fourth exon (E4) of Ythdf1. PAM sequence, underlined, green; F1
and R1, genotyping primers. b, Genotyping PCR products of the seven
founders co-injected with 20 ng Cas9 mRNA and the two sgRNAs (5 ng
each). c, Genotypes of sequenced mice. PCR products were cloned and
sequenced. Founder #4 with a 179-bp deletion was crossed with C57BL/6
wild-type mice for further analysis. d, Representative genotyping PCR
products of offspring mice with different genotypes. e, Validation of
Ythdf1 knockout by western blot using mouse hippocampal tissues. For
gel source data, see Supplementary Fig. 1. f, Representative images of
YTHDF1 immunostaining in the mouse basal lateral amygdala (BLA) and
the cortex.

Extended Data Fig. 2 | Ythdf1-KO mice are normal in hippocampal
neurogenesis, cortical morphology, motor activities, anxiety-like
behaviour, and depressive-like behaviour. a, b, Representative images of
Doublecortin (DCX, a marker of neurogenesis) immunostaining (a) and
quantification of the number of DCX⁺ cells (b) in the dentate gyrus (DG)
region of Ythdf1-KO and wild-type control mice at different postnatal
development stages. Scale bar, 100 μm. c, Representative images of cortical
morphology staining using Hoechst in adult control and Ythdf1-KO mice.
Scale bar, 200 μm. d, Representative confocal immunostaining of CTIP2
(a marker for deep layer cortical neurons) and SATB2 (a marker for upper
layer cortical neurons) in the cortex of adult control and Ythdf1-KO mice.
Scale bar, 200 μm. e-h, Motor activities measured by various parameters
as listed in the open-field test. i-l, Anxiety-like behaviour measured by the
light-dark box transition test (i, j) and the elevated-plus maze test (k, l).
m, Depressive-like behaviour measured by tail suspension test.
P values, two-tailed t-test. Numbers in bars, numbers of mice. Data shown
as mean ± s.e.m.

Extended Data Fig. 3 | MWM tests and fear conditioning tests in
Ythdf1-KO mice. a, Schematics of procedure of MWM training and
MWM probe tests. b, c, Number of crossings over previous platform
location (b) and swimming velocity (c) of control and Ythdf1-KO mice in
MWM probe tests. d, Schematics of the fear conditioning procedures (left)
and freezing responses measured at different stages (right). e, Titration
curves of the freezing level of wild-type mice 24 h after training with
different foot shock intensities. The conditioning protocols used in later
[...] curves for auditory fear conditioning under moderate (left) or strong
(right) training protocols. The training sessions were separated into two
parts: baseline (base) and tone periods (tone). g, h, Auditory fear memory
of control and Ythdf1-KO mice assessed 24 h (g) and 2 h (h) after the
indicated training sessions. P values, two-way repeated measures ANOVA
(d) and two-tailed t-test (b, c, f-h). Numbers in bars, numbers of mice.
Data shown as mean ± s.e.m.

Extended Data Fig. 4 | PPRs, spine morphology, and total protein
levels of various LTP-related genes in Ythdf1-KO mouse hippocampus.
Related to Fig. 2. a, b, PPR with different inter-stimulus intervals in CA1
neurons from wild-type control and Ythdf1-KO mice. c, d, Representative
images of Lucifer yellow staining (c) and statistical analyses of spine
density (d, left) and spine size (d, right) in CA1 neurons from adult control
and Ythdf1-KO brains. e, Uncropped western blot images for Fig. 2g.
f, Total protein levels of a set of LTP-related genes in control and
Ythdf1-KO mouse hippocampus. For gel source data, see Supplementary
Fig. 1. P values, two-way repeated measures ANOVA with post hoc two-tailed
t-test (a) and two-tailed t-test (b, d, f). Numbers in bars, numbers of
slices (b), neurons/mice (d, left), spines (d, right), or mice (f). Data shown
as mean ± s.e.m.

Extended Data Fig. 5 | Viral targeting in Ythdf1-KO mouse
hippocampus and behavioural analyses of Ythdf1-KO mice injected
with AAV virus. Related to Fig. 3. a, Representative fluorescence images
of brain slices from rostral to caudal positions dissected from a mouse
injected with AAV-YTHDF1 virus. Hoechst, blue; YTHDF1 co-expressed
with mCherry, red. b, Representative images of virus expression (mCherry,
red) and YTHDF1 immunostaining (green) in the mouse hippocampus
after AAV-control or AAV-YTHDF1 infection. Hoechst, blue. c, YTHDF1
[...] previous platform location for Ythdf1-KO mice injected with
AAV-YTHDF1 or AAV-control in MWM probe tests. e, Anxiety-like behaviour
of the injected mice measured as open-arm durations in elevated-plus
maze. f-h, Motor activities of the injected mice measured as total distance
(f), number of moves (g), and average velocity (h) in the open-field test.
P values, two-tailed t-test (c-h). Numbers in bars, numbers of mice. Data
shown as mean ± s.e.m.

Extended Data Fig. 6 | Impaired spatial learning and memory after
selective knockdown of YTHDF1 in the hippocampus of wild-type
mice. a, Schematics of the AAV construct expressing YTHDF1 shRNA.
b, Western blot and quantification of protein expression level of YTH
proteins in N2A cells after YTHDF1-shRNA (RNAi) or control vector
(Ctrl) transfection. For gel source data, see Supplementary Fig. 1. c, Spatial
learning curves in the hidden-platform MWM training sessions for
RNAi (red) and control (grey) mice. d-f, Spatial memory performances
measured by quadrant time (%) (d) and number of platform crossings (e),
and motor activities (f) of RNAi (red) and control (grey) mice in MWM
probe tests. g, i, Contextual (g) and auditory (i) fear memories assessed
24 h after fear conditioning in RNAi and control mice. h, Anxiety level
of mice assessed by open-arm durations in elevated-plus maze. P values,
two-way repeated measures ANOVA with post hoc two-tailed t-test (c),
two-way ANOVA with two-tailed t-test (comparison between groups or
to ‘Target’) (d), and two-tailed t-test (b, e-i). Numbers in bars, numbers
of biologically independent samples (b) and mice (d-i). Data shown as
mean ± s.e.m.

Extended Data Fig. 7 | Impaired spatial learning and memory after
acute knockdown of METTL3 in the hippocampus of wild-type mice.
a, Representative western blot (left) and quantification (right) of METTL3
protein level in N2A cells transfected with METTL3-shRNA (RNAi)
or control vector (Ctrl). For gel source data, see Supplementary Fig. 1.
b-d, Spatial learning curves in the hidden-platform MWM training
sessions (b), and spatial memory performance measured by quadrant
time (per cent) (c) and the number of platform crossings (d) in MWM
probe tests, for METTL3-RNAi and control mice. e, Contextual (left) and
auditory (right) fear memories measured by freezing levels 24 h after fear
conditioning in METTL3-RNAi and control mice. f, Motor activities of
mice assessed in the open-field test. P values, two-way repeated measures
ANOVA with post hoc two-tailed t-test (b), two-way ANOVA with two-tailed
t-test (comparison between groups or to ‘Target’) (c), and two-tailed
t-test (a, d-f). Numbers in bars, numbers of biologically independent
samples (a) and mice (c-f). Data shown as mean ± s.e.m.

Extended Data Fig. 8 | YTHDF1 binding sites and m⁶A sites in the
hippocampus of adult mice, and YTHDF1-mediated effects of m⁶A
on hippocampal transcriptome and proteome. a, Peak overlap among
three biological replicates of YTHDF1-CLIP-seq. b, Validation of
immunoprecipitation efficiency for YTHDF1-CLIP-seq. The position of
the gel slice cut during the step of protein-RNA complex size selection
is indicated in red (see Methods). c, Consensus motif and its P value
generated by HOMER of the three sets of hippocampal m⁶A sites from
biological replicates of m⁶A-CLIP-seq. d, e, Distribution of m⁶A-CLIP
peaks along the different regions of transcripts (d) and genome (e).
f, Functional annotation of m⁶A-modified transcripts in the adult mouse
hippocampus (number of mutations in m⁶A-CLIP-seq ≥ 5, n = 2,922).
g, Peak overlap between high-confidence YTHDF1-CLIP peaks and
high-confidence m⁶A-CLIP peaks. The percentage of YTHDF1-CLIP
peaks overlapped is indicated. h, Integrative Genomics Viewer (IGV)
screenshots of the piled mutated reads for each of the biological
triplicates of YTHDF1-CLIP-seq (red) and m⁶A-CLIP-seq (blue). Three
examples of synaptic plasticity transcripts are presented; the overlapped
peak regions are highlighted in orange. i, j, Box-plots of mRNA abundance
(i) and protein abundance (j) log2 fold changes (Δ) between Ythdf1-KO
hippocampus and wild-type control for all expressed genes (black),
non-YTHDF1-CLIP transcripts (grey), YTHDF1-CLIP targets (red),
transcripts with overlapped YTHDF1-CLIP peaks and m⁶A-CLIP peaks
(pink), and m⁶A-modified transcripts (blue). Box-plot elements: centre
line, median; box limits, upper and lower quartiles; whiskers, 1-99%; P
values, two-sided unpaired Kolmogorov-Smirnov test; number of genes
and 95% confidence interval of mean are indicated for each box (i, j).

Extended Data Fig. 9 | Effects of YTHDF1 on nascent protein synthesis
in cultured hippocampal neurons in response to KCl stimulus.
a, Additional representative images of nascent protein (Nascent-P)
synthesis in cultured wild-type control and Ythdf1-KO hippocampal
neurons before (sham) and 2 h after KCl depolarization, related to Fig. 4e,
f. b, c, Representative images (b) and quantification (c) of Nascent-P in
wild-type control and Ythdf1-KO hippocampal neurons before (sham)
and 4 h after KCl depolarization. d, e, Representative images (d) and
quantification (e) of Nascent-P in AAV-control and AAV-YTHDF1-RNAi
treated hippocampal neurons before (sham), 2 h, and 4 h after KCl
depolarization. Intensities of Nascent-P were normalized to that of
wild-type control (c) or AAV-control (e) neurons under the sham condition.
P values, two-tailed t-test (c, e). Numbers in bars, numbers of images/
biologically independent samples. Data shown as mean ± s.e.m.

class of skeletal stem cells

Skeletal stem cells regulate bone growth and homeostasis by
generating diverse cell types, including chondrocytes, osteoblasts
and marrow stromal cells. The emerging concept postulates
that there exists a distinct type of skeletal stem cell that is closely
associated with the growth plate, which is a type of cartilaginous
tissue that has critical roles in bone elongation. The resting zone
maintains the growth plate by expressing parathyroid hormone-related
protein (PTHrP), which interacts with Indian hedgehog
(Ihh) that is released from the hypertrophic zone, and provides a
source of other chondrocytes. However, the identity of skeletal stem
cells and how they are maintained in the growth plate are unknown. 
Here we show, in a mouse model, that skeletal stem cells are formed 
among PTHrP-positive chondrocytes within the resting zone of the 
postnatal growth plate. PTHrP-positive chondrocytes expressed a 
panel of markers for skeletal stem and progenitor cells, and uniquely 
possessed the properties of skeletal stem cells in cultured conditions. 
Cell-lineage analysis revealed that PTHrP-positive chondrocytes 
in the resting zone continued to form columnar chondrocytes in 
the long term; these chondrocytes underwent hypertrophy, and 
became osteoblasts and marrow stromal cells beneath the growth 
plate. Transit-amplifying chondrocytes in the proliferating 
zone—which was concertedly maintained by a forward signal 
from undifferentiated cells (PTHrP) and a reverse signal from 
hypertrophic cells (Ihh)—provided instructive cues to maintain the 
cell fates of PTHrP-positive chondrocytes in the resting zone. Our 
findings unravel a type of somatic stem cell that is initially unipotent 
and acquires multipotency at the post-mitotic stage, underscoring 
the malleable nature of the skeletal cell lineage. This system provides 
a model in which functionally dedicated stem cells and their niches 
are specified postnatally, and maintained throughout tissue growth 
by a tight feedback regulation system. 

We first defined the formation of PTHrP⁺ chondrocytes in the
growth plate using a Pthrp-mCherry (Pthrp is also known as Pthlh)
knock-in reporter allele (Extended Data Fig. 1a, see also Supplementary
Information). During the fetal stage, PTHrP-mCherry⁺ cells were
mitotically active and localized within the Sox9⁺ perichondrial region
(Extended Data Fig. 1b). Although this pattern continued at birth
(Fig. 1a), a distinct group of PTHrP-mCherry⁺ chondrocytes appeared
in the central area of the growth plate that is devoid of proliferation at
postnatal day (P)3 (Extended Data Fig. 1c). These PTHrP-mCherry⁺
chondrocytes increased markedly in number between P6 and P9,
and occupied a well-defined zone in the growth plate (Fig. 1b-d,
Extended Data Fig. 1c); these chondrocytes were less proliferative
than their counterparts in the proliferating zone (EdU⁺; 6.1 ± 2.3% of
mCherry⁺ cells versus 30.5 ± 3.2% of proliferating chondrocytes at P9,
n = 3 mice). Therefore, PTHrP-mCherry⁺ chondrocytes in the resting
zone (‘resting chondrocytes’) develop in the postnatal growth plate,
which is closely associated with the formation of secondary ossification
centres. Flow cytometry analysis revealed that PTHrP-mCherry⁺
cells were exclusively found in the CD45-negative cell population in the
growth plate (Fig. 1e), and were completely absent in the CD45-negative
population in bone and bone marrow cells (Extended Data Fig. 2a).
PTHrP-mCherry⁺ cells in the growth plate did not express
Col1a1(2.3kb)-GFP (Extended Data Fig. 2b), which indicates that
PTHrP-mCherry is specifically expressed by growth-plate chondrocytes
but not by osteoblasts or bone marrow stromal cells. We next
asked whether PTHrP-mCherry⁺ resting chondrocytes express a
panel of cell-surface markers for transplantable skeletal stem and
progenitor cells, particularly three subsets of skeletal stem and
progenitor populations (integrin alpha V (CD51)⁺Thy-1 (CD90)⁻):
mouse skeletal stem cells (mSSCs) (CD105⁻CD200⁺), pre-bone,
cartilage and stromal progenitors (pre-BCSPs) (CD105⁻CD200⁻), and
bone, cartilage and stromal progenitors (BCSPs) (CD105⁺). A large
majority of CD45⁻Ter119⁻CD31⁻ growth-plate cells, including
both mCherry⁻ and mCherry⁺ fractions, were in a CD51⁺CD90⁻
skeletal stem and progenitor population (Fig. 1f, left panels). Among
CD45⁻Ter119⁻CD31⁻CD51⁺CD90⁻mCherry⁺ cells, 49.2 ± 8.4%,
23.4 ± 8.4% and 27.4 ± 16.5% were CD105⁻CD200⁺ (mSSCs),
CD105⁻CD200⁻ (pre-BCSPs) and CD105⁺ (BCSPs), respectively
(Fig. 1f, right panels; see also Extended Data Fig. 2c, d). Conversely,
41.6 ± 4.4%, 31.7 ± 6.2% and 53.4 ± 16.9% of mSSCs, pre-BCSPs and
BCSPs, respectively, were positive for PTHrP-mCherry (Extended Data
Fig. 2e). Therefore, PTHrP-mCherry⁺ resting chondrocytes represent a
substantial subset of immunophenotypically defined skeletal stem and
progenitor cells in the growth plate.

We next determined whether PTHrP⁺ resting chondrocytes
behave as stem cells in vivo, by using a Pthrp-creER bacterial artificial
chromosome transgenic line (L909, Extended Data Fig. 3a;
see also Supplementary Information, Supplementary Methods and
Extended Data Fig. 10 for establishment of this system and validation
of tamoxifen-negative controls). Analysis of Pthrp^mCherry/+;Pthrp-creER;R26R^ZsGreen
mice revealed that ZsGreen⁺ cells largely overlapped
with mCherry⁺ cells shortly after a tamoxifen pulse at P6 (Extended
Data Fig. 3b-d). The percentage of CD105⁺ cells within the ZsGreen⁺
cell population was significantly lower than that within the mCherry⁺
cell population (Extended Data Fig. 3e), which indicates that Pthrp-creER
preferentially marks an immature subset of PTHrP-mCherry⁺
cells. An EdU label-exclusion assay of Pthrp-creER;R26R^tdTomato mice
pulsed with tamoxifen at P6 revealed that a large majority of tdTomato⁺
cells were resistant to EdU incorporation (Extended Data
Fig. 3f, EdU⁺; 7.7 ± 2.0% of tdTomato⁺ cells versus 61.1 ± 11.5% of
proliferating-zone chondrocytes, n = 3 mice), which demonstrates that
Pthrp-creER specifically marks resting chondrocytes (Extended Data
Fig. 3g). These PTHrP⁺ resting chondrocytes did not express Grem1⁴
(Extended Data Fig. 3h). Subsequently, we traced the fate of PTHrP⁺
resting chondrocytes labelled on P6 (hereafter, PTHrP^CE-P6 cells)
in vivo. After remaining within the resting zone at P12 (Fig. 2a; see also
Extended Data Fig. 3g), PTHrP^CE-P6 cells first formed short columns
(composed of <10 cells) (Fig. 2b, arrowhead) and subsequently formed
longer columns (composed of >10 cells) that originated from the
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Fig. 1 | Formation of PTHrP-mCherry* chondrocytes in the resting 
zone of the growth plate. a—c, Pthrp™“"""""+ distal-femur growth 
plates with EdU administration shortly before analysis. Bottom panels 
show magnified views of central growth plates. RZ, resting zone; 

PZ, proliferating zone; GP, growth plate; POC, primary ossification 
centre; SOC, secondary ossification centre. Grey, DAPI and DIC. Scale 
bars, 200 jm (top panels), 50 zm (bottom panels). d, Quantification 
of mCherry* cells. n =3 mice per group, data are presented as 


resting zone, at around P18 (Fig. 2c, arrows). After a month of chase,
PTHrP^CE-P6 cells constituted the entire column from the resting zone
to the hypertrophic zone (Fig. 2d). The number of tdTomato+ resting
chondrocytes transiently increased during the first week of chase and
decreased thereafter, owing to the formation of columnar chondrocytes
(Fig. 2e). The number of short tdTomato+ columns peaked at P18 and
decreased thereafter, whereas long tdTomato+ columns appeared at
P18 and continued to increase until P36 (Fig. 2f). Thus, Pthrp-creER+
resting chondrocytes stay within the resting zone for the first week,
and establish columnar chondrocytes starting from the second week of
chase. Analysis of Pthrp-creER;R26R^Confetti mice revealed that each
column was marked by its unique colour (CFP, YFP or tdTomato, Fig. 2g),
which demonstrates that single Pthrp-creER+ resting chondrocytes
can give rise to multiple types of chondrocytes. Additional analysis
of Col2a1-creER;R26R^Confetti mice further supported the existence of
clonal cell populations (Extended Data Fig. 4a). Together, these findings
support the notion that individual PTHrP+ resting chondrocytes are
multipotent and can clonally establish columnar chondrocytes in the
growth plate.

To investigate whether Pthrp-creER+ resting chondrocytes undergo
self-renewing asymmetric divisions, we performed an EdU label-retention
assay. Analysis of PTHrP^CE-P6 cells with serial pulses of EdU
revealed that, after three weeks of chase, these cells gradually diluted
the EdU signal as they differentiated towards the hypertrophic zone
(Fig. 2h). Further, PTHrP^CE-P6 cells in the resting zone expressed
PTHrP-mCherry, whereas those in the proliferating zone lost this
expression (Fig. 2i). Therefore, Pthrp-creER+ chondrocytes maintain
themselves in the resting zone as PTHrP+ cells and become the
mean ± s.d. e, Flow cytometry analysis of Pthrp^mCherry/+ growth-plate
cells. n = 8 mice, data are presented as mean ± s.d. f, Skeletal stem
and progenitor cell-surface-marker analysis of Pthrp^mCherry/+ growth-plate
cells. mCherry−, mCherry− fraction of Pthrp^mCherry/+ cells;
mCherry+, mCherry+ fraction of Pthrp^mCherry/+ cells. Magenta box,
CD45−Ter119−CD31−CD51+CD90− mCherry+ fraction. n = 3 mice per
group, data are presented as mean ± s.d.


source of columnar chondrocytes in the growth plate, by providing
the transit-amplifying progeny. Analysis of Pthrp-creER;R26R^tdTomato
mice pulsed at various preceding prenatal and early postnatal
time points revealed that Pthrp-creER+ chondrocytes started to
be formed within the resting zone at embryonic day (E)17.5 (Extended
Data Fig. 4b-e); a tamoxifen pulse on a later day laterally expanded the
domain of tdTomato+ cells. However, once they were marked, tdTomato+
cells did not expand laterally upon further chase (Extended
Data Fig. 4f, g), which indicates that PTHrP+ resting chondrocytes are
dedicated, at least to some degree, to making columnar chondrocytes
longitudinally. Additional analysis of Dlx5-creER;R26R^tdTomato
mice revealed that chondrocytes in the proliferating and hypertrophic
zone could only form short columns (<10 cells) that eventually disappeared
from the growth plate (Extended Data Fig. 5a-d), indicating that
Dlx5-creER+ proliferating chondrocytes are not the source of columnar
chondrocytes in the growth plate.

During an extended chase period, PTHrP^CE-P6 cells continued to
form columnar chondrocytes within the growth plate for at least a
year after the pulse (Fig. 3a-c for Col1a1(2.3kb)-GFP; Extended Data
Fig. 6a-d for Cxcl12-GFP): the number of tdTomato+ columns in the
growth plate gradually decreased until six months after the pulse, and
reached a plateau thereafter (Fig. 3d). A majority of tdTomato+ columns
extended beyond the hypertrophic layer and continued into the
primary spongiosa and the metaphyseal bone marrow, an area beneath
the growth plate. These chondrocytes became Cxcl12-GFP+ stromal
cells beneath tdTomato+ columns (Extended Data Fig. 6e), and reticular
cells near trabecular bones (Fig. 3a, bottom). These chondrocytes
also became Col1a1(2.3kb)-GFP+ osteoblasts on the trabecular surface
Fig. 2 | Pthrp-creER+ resting chondrocytes are the source of
columnar chondrocytes. a-d, Cell-fate analysis of Pthrp-creER+ resting
chondrocytes. Col1a1(2.3kb)-GFP;Pthrp-creER;R26R^tdTomato (pulsed on P6)
distal-femur growth plates. Arrowhead, short column (<10 cells); arrows,
long columns (>10 cells). Scale bars, 200 µm. e, f, Quantification of
tdTomato+ cells in resting zone (red line) (e) and columns in growth
plate, short columns (<10 cells, green line) and long columns (>10
cells, blue line) (f). n = 5 (P9), n = 3 (P12-P36) mice per group, data
are presented as mean ± s.d. g, In vivo clonal analysis of Pthrp-creER+


(Fig. 3a, bottom) and in the primary spongiosa (Fig. 3b, bottom). The
number of Cxcl12-GFP+tdTomato+ stromal cells and Col1a1(2.3kb)-GFP+tdTomato+
osteoblasts increased for the first three months of
chase; subsequently, the number of Col1a1(2.3kb)-GFP+tdTomato+
osteoblasts decreased, whereas the number of Cxcl12-GFP+tdTomato+
stromal cells reached a plateau (Fig. 3e). These cells did not become
bone marrow adipocytes in the presence of a high-fat diet that contained
the PPARγ agonist rosiglitazone (LipidTOX+, 0 out of 443 cells
examined; Extended Data Fig. 6f). Therefore, a subset of Pthrp-creER+
resting chondrocytes can continue to reproduce themselves within the
resting zone in the long term; their descendants first differentiate into
hypertrophic chondrocytes within the growth plate, and then become
multiple types of cells beyond the growth plate, such as osteoblasts and
bone marrow stromal cells (but not adipocytes), in vivo.

We next performed a colony-forming assay to test whether Pthrp-creER+
resting chondrocytes behave as skeletal stem cells under culture
conditions (refs 14, 15). PTHrP^CE-P6 cells formed distinct and large tdTomato+
colonies (>50 cells) composed of small Sox9+ spherical cells (about
20 µm in diameter) (Extended Data Fig. 7a, b). By contrast, Dlx5+
proliferating chondrocytes labelled on P7 failed to form tdTomato+
colonies (Extended Data Fig. 7b, right), which indicates that Pthrp-creER+
resting chondrocytes uniquely possess the capacity to form colonies
when cultured ex vivo (Extended Data Fig. 7c). We next isolated
individual primary PTHrP^CE-tdTomato+ colonies and sub-cultured
them further to determine whether individual colony-forming cells
can self-renew in vitro (Extended Data Fig. 7d, see also Supplementary
Information). Although a small fraction of P9 PTHrP^CE-tdTomato+
primary colonies had the ability to establish secondary colonies
resting chondrocytes. Pthrp-creER;R26R^Confetti/+ distal-femur growth plates
(pulsed on P6, P7 and P8). 4-OHT, 4-hydroxytamoxifen. Scale bars, 50 µm.
n = 3 mice. h, EdU label-retention assay of Pthrp-creER;R26R^tdTomato
distal-femur growth plates (pulsed on P6, P7 and P8). Arrowheads,
EdU-retaining tdTomato+ cells. Scale bars, 50 µm. n = 3 mice. i, PTHrP-mCherry
expression in Pthrp-creER;R26R^ZsGreen;Pthrp^mCherry/+ distal-femur
growth plates (pulsed on P6). Arrowheads, PTHrP-mCherry+ZsGreen+
cells. Scale bars, 20 µm. Grey, DAPI and DIC. n = 3 mice.


(17 out of 518 clones, 3.3%), none of them could survive a further passage
(Extended Data Fig. 7e). By contrast, an increased fraction of P12
PTHrP^CE-tdTomato+ colonies established secondary colonies (16 out
of 98 clones, 16.3%), and a fraction of these clones (2 out of 16 clones,
12.5%) could be further passaged for at least nine generations (Fig. 4a).
Thus, Pthrp-creER+ colony-forming cells appear to acquire robust
in vitro self-renewability when the secondary ossification centre actively
develops. Further, individual PTHrP^CE-tdTomato+ cells (passage 4-7)
could generate Alcian blue+ spheres, Alizarin red+ mineralized matrix
and LipidTOX+ oil droplets under chondrogenic, osteogenic and adipogenic
differentiation conditions, respectively (Fig. 4b; 4 out of 4
clones, 100%). Upon subcutaneous transplantation into immunodeficient
mice, these cells robustly differentiated into Col1a1(2.3kb)-GFP+
osteoblastic cells (Fig. 4c) and effectively gave rise to Alcian blue+ and
Alizarin red+ matrix, but produced Oil red O+ lipid droplets only
ineffectively (Extended Data Fig. 7f). These findings indicate that PTHrP+
skeletal stem cells are predisposed to become chondrocytes and osteoblasts
in vivo, and possess a baseline potential to become adipocytes
in an inductive condition in vitro.
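
As a quick arithmetic check on the clone fractions quoted in this paragraph, the following minimal Python sketch recomputes each percentage from the reported counts. The function names and labels are illustrative assumptions and are not part of the original analysis.

```python
# Clone counts as reported in the text: (label, successes, total).
# The labels are paraphrases chosen for this sketch.
CLONE_COUNTS = [
    ("P9 primary colonies forming secondary colonies", 17, 518),
    ("P12 primary colonies forming secondary colonies", 16, 98),
    ("P12 secondary-forming clones passaged >= 9 generations", 2, 16),
    ("passage 4-7 clones showing trilineage differentiation", 4, 4),
]


def percentage(successes: int, total: int) -> float:
    """Return successes/total as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * successes / total, 1)


def summarize(counts):
    """Map each label to its percentage."""
    return {label: percentage(k, n) for label, k, n in counts}


if __name__ == "__main__":
    for label, value in summarize(CLONE_COUNTS).items():
        print(f"{label}: {value}%")
    # Long-term self-renewing clones as a share of all P12 colonies tested.
    print(f"long-term clones among P12 colonies: {percentage(2, 98)}%")
```

The recomputed values agree with the percentages quoted in the text. The last line gives about 2.0% of P12 colonies, which is of the same order as the 'approximately 2-3%' figure quoted later in the text, although the source does not state how that figure was derived.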

Lastly, we set out to investigate the functional importance of
PTHrP+ resting chondrocytes. Inducible cell ablation experiments
using Pthrp-creER;R26^lsl-tdTomato/+ (control) and Pthrp-creER;
R26^lsl-tdTomato/iDTA (hereafter, DTA) littermates revealed that Pthrp-creER+
cells were only incompletely ablated; tdTomato+ resting
chondrocytes and columns were still observed in the induced tissue
of DTA mice (Fig. 5a, b). Nonetheless, the height of each layer of the
growth plate was altered in the induced tissue of DTA mice, in which
the proliferating zone was significantly reduced in association with the
Fig. 3 | Pthrp-creER+ resting chondrocytes behave as skeletal stem cells
in vivo. a-c, Long-chase analysis of Pthrp-creER+ resting chondrocytes.
Col1a1(2.3kb)-GFP;Pthrp-creER;R26R^tdTomato distal femurs (pulsed on P6).
In a, b, the bottom panel shows a magnified view of marrow space
(white box in top panel). Arrowheads, Col1a1(2.3kb)-GFP+tdTomato+
osteoblasts; asterisks, tdTomato+ reticular stromal cells. Grey, DAPI and
DIC. Scale bars, 500 µm (top panels), 50 µm (bottom panels). n = 3 mice
per group, except in b, n = 1 mouse. d, Quantification of tdTomato+
columns in growth plate (red line) during the chase. n = 8 (1 month,


significant expansion of the hypertrophic and resting zones (Fig. 5c).
Therefore, partial loss of PTHrP+ cells in the resting zone is sufficient
to alter the integrity of the growth plate by inducing premature
hypertrophic differentiation of chondrocytes in the proliferating zone.
Fig. 4 | Skeletal stem cell activities of Pthrp-creER+ resting chondrocytes
ex vivo. a, Colony-forming assay and subsequent passaging of individual
PTHrP^CE-tdTomato+ colonies. Inset, magnified view of single colony. Red,
tdTomato. Scale bars, 5 mm, 1 mm (inset). LT-SSCs, long-term skeletal
stem cells. n = 98 independent experiments. b, Trilineage differentiation
of PTHrP^CE-tdTomato+ clones (passage 4 to 7). Chondrogenic (left),
osteogenic (centre) and adipogenic (right) differentiation conditions.
Insets, differentiation-medium negative controls. ITS, insulin-transferrin-selenium;
OM, osteogenic differentiation medium. Four independent
clones were tested. c, Subcutaneous transplantation of PTHrP^CE-tdTomato+
clones into immunodeficient mice. Dotted line, contour of
the plug. Grey, DIC. Scale bars, 1 mm. n = 8 mice.
(1 month, 2 months, 3 months for Col1a1(2.3kb)-GFP and Cxcl12-GFP,
6 months for Cxcl12-GFP), n = 4 (12 months for Col1a1(2.3kb)-GFP),
n = 2 (12 months for Cxcl12-GFP) mice per group, data are presented as
mean ± s.d., n = 1 (6 months for Col1a1(2.3kb)-GFP) mouse.


Moreover, global manipulation of Hedgehog (Hh) signalling using Smo
agonist (SAG) and antagonist (LDE225) in Pthrp-creER;R26R^tdTomato
mice pulsed on P6 revealed that these regimens predominantly affected
chondrocytes in the proliferating zone, without directly affecting
PTHrP^CE-P6 cells in the resting zone (Extended Data Fig. 8a-c). Both
regimens resulted in a significantly reduced number of tdTomato+
columns (Fig. 5d; see also Extended Data Fig. 8d-k), indicating that
uninterrupted Hh signalling is essential to maintaining the proper
cell fates of PTHrP+ resting chondrocytes. Pthrp-creER+ cells directly
differentiated into Col1a1(2.3kb)-GFP+ osteoblasts in response to
micro-perforation injury (Extended Data Fig. 8l, m), which indicates
that PTHrP+ skeletal stem cells lose their physiological fate in the
absence of an intact proliferating zone.

Here we found that the resting zone of the growth plate houses
a unique class of skeletal stem cells, the transit-amplifying progeny of
which are lineage-restricted as chondrocytes that exhibit multipotency
only at the post-mitotic stage (see Extended Data Fig. 9a, b). PTHrP+
cells are one of the stem-cell subgroups organized within the resting
zone and, together with other as-yet unidentified cells, these cells
can concertedly contribute to long-term tissue renewal. PTHrP+ skeletal
stem cells are dedicated to making columnar chondrocytes longitudinally,
and appear to derive from PTHrP− cells. PTHrP+ stem
cells are highly hierarchical; approximately 2-3% of these cells acquire
long-term self-renewability (Extended Data Fig. 9b). In addition,
these stem cells are endowed with the ability to maintain the integrity
of the growth plate, by sending a forward signal (that is, PTHrP) for
transit-amplifying chondrocytes to maintain their proliferation and
delay their hypertrophy in a non-cell-autonomous manner. Therefore,
PTHrP+ stem cells can also provide the niche for transit-amplifying
cells, which is compatible with a model previously proposed for the
epithelium (ref. 16). Conversely, transit-amplifying cells, which are maintained
in a Hedgehog-responsive manner, appear to provide instructive cues
to determine the cell fates of PTHrP+ stem cells within the growth
Fig. 5 | Reciprocal interactions between PTHrP-creER+ resting
chondrocytes and their niche. a-c, DTA-mediated ablation of Pthrp-creER+
resting chondrocytes. a, Pthrp-creER;R26^lsl-tdTomato/+ (control).
b, Pthrp-creER;R26^lsl-tdTomato/iDTA (DTA) distal-femur growth plates (pulsed
on P6). HZ, hypertrophic zone. Grey, DAPI and DIC. Right panels,
haematoxylin and eosin staining. Scale bars, 200 µm (left panels) and
100 µm (right panels). c, Quantification of resting (left), proliferating
(centre) and hypertrophic (right) zone height. TOM, tdTomato. n = 5
mice for control, n = 7 mice for DTA, data are presented as mean ± s.d.,
P values from Mann-Whitney's U-test, two-tailed. d, Pharmacological


plate, which implies a reciprocal interaction between the stem cells and
their progeny. We assume that PTHrP− short-term precursors are the
principal driver for extensive bone growth that occurs during postnatal
development, reminiscent of a model proposed for haematopoietic
stem cells (refs 17, 18). It is possible that PTHrP+ skeletal stem cells are mainly
involved in the long-term maintenance of skeletal integrity, although
further details need to be clarified.
Extended Data Fig. 1 | Generation and characterization of Pthrp-mCherry
knock-in allele. a, CRISPR-Cas9 generation of Pthrp-mCherry
knock-in allele. Structure of the genomic Pthrp locus, targeting vector
and knock-in allele after homologous recombination. White boxes,
untranslated region; black boxes, coding region; ex, exon. Blue bars,
homology arms; red bars, guide RNAs (gRNAs) as part of CRISPR-Cas9
reagents; red boxes, Kozak-mCherry-bGHpA cassette replacing the native
start codon. Half arrows, primers; wild-type forward (289), wild-type
reverse (290) and mutant reverse (291). Bottom, PCR genotyping using
289, 290 and 291 primer mix; wild-type (WT) allele, 185 bp; knock-in
(KI) allele, 385 bp. At least n = 100 independent experiments with similar
results. b, Pthrp^mCherry/+ fetal distal femurs with EdU administration
shortly before analysis (3 h). Bottom panels show magnified views of
perichondrium. Dotted lines, borders of bone anlage. Grey, DAPI and
DIC. Scale bars, 200 µm (top panels), 100 µm (bottom panels). n = 2
(E13.5, E15.5) mice, n = 1 (α-Sox9) mouse. c, Pthrp^mCherry/+ distal-femur
growth plates with EdU administration shortly before analysis
(3 h). Bottom panels show magnified views of central growth plates.
Arrowheads, mCherry+ cells. Grey, DAPI and DIC. Scale bars, 200 µm
(top panels), 50 µm (bottom panels).
Extended Data Fig. 3 | Generation and characterization of Pthrp-creER
bacterial artificial chromosome transgenic line. a, Generation
of Pthrp-creER bacterial artificial chromosome (BAC) transgenic mice.
Structure of the Pthrp-creER-WPRE-rGHpA BAC construct. Kozak-Pthrp-creER-WPRE-rGHpA-frt-Neo^R-frt
cassette containing 62-bp homology
arms was recombineered into a BAC clone RP23-27F7 containing 131-kb
upstream and 82-kb downstream genomic sequences of the Pthrp
gene. Neo^R and backbone lox sites were removed before pronuclear
injection. Half arrows, forward (62) and reverse (63) primers. Right,
PCR genotyping using 62 and 63 primer mix; transgenic (Tg), 373 bp.
White boxes, exons; black boxes, introns. At least n = 100 independent
experiments with similar results. b, Short-chase analysis of
Pthrp-creER;R26R^ZsGreen;Pthrp^mCherry/+ distal-femur growth plates (pulsed on P6).
Scale bars, 50 µm. n = 3 mice. c-e, Short-chase flow cytometry analysis of
Pthrp-creER;R26R^ZsGreen;Pthrp^mCherry/+ growth-plate cells, with tamoxifen
injection at 72 h (c, e) or 22 h (d) in advance. Red lines, ZsGreen+ cells;
blue lines, control cells without PTHrP-mCherry. n = 5 mice (72 h)
or n = 3 mice (22 h) per group. e, Percentage of CD105+ cells within
mCherry+ (red) and ZsGreen+ (green) cells. n = 5 mice per group, data are
presented as mean ± s.d., *P = 0.012, Mann-Whitney's U-test, two-tailed.
f, Pthrp-creER;R26R^tdTomato distal-femur growth plates (pulsed on P6) at
P9. EdU (50 µg) was serially injected 9 times at 8-h intervals between P6
and P9. Grey, DIC. Scale bars, 50 µm. n = 3 mice. g, Scanning of
Pthrp-creER;R26R^tdTomato whole femur (pulsed on P6) at P12. Arrow, tdTomato+
cells localized within the resting zone of distal femur. Grey, DAPI and
DIC. Scale bars, 1 mm. n = 3 mice. h, High sensitivity in situ hybridization
(RNAscope) analysis of Pthrp-creER;R26R^tdTomato distal-femur growth
plates (pulsed on P6) at P12. Top and bottom panels represent the identical
section, before (bottom panels) and after (top panels) hybridization.
Left panels, Col2a1 (positive control); centre panels, Grem1; right
panels, negative control. Grey, DAPI and DIC. Scale bars, 200 µm.
n = 3 independent experiments.
Extended Data Fig. 7 | Pthrp-creER+ resting chondrocytes
uniquely possess colony-forming capabilities ex vivo. a, Diagram
of colony-forming assay. Growth-plate cells were isolated from
Pthrp-creER;R26R^tdTomato (pulsed on P6) or Dlx5-creER;R26R^tdTomato (pulsed
on P7) mice at P9, and cultured at a clonal density (~1,000 cells per cm²)
for 10-14 days to initiate colony formation. BM, bone marrow.
b, Colony-forming assay. Left top, Pthrp-creER;R26R^tdTomato; right,
Dlx5-creER;R26R^tdTomato. Insets 1, 2 and 3 show magnified views of the
corresponding areas (labelled with 1, 2, 3). Bottom left, Sox9 staining of
primary Pthrp-creER tdTomato+ colonies. Red, tdTomato. Scale bars,
5 mm (top panels), 1 mm (top panel insets), 200 µm (bottom panel). n = 88
mice for Pthrp-creER;R26R^tdTomato, n = 5 for Dlx5-creER;R26R^tdTomato.
c, Quantification of tdTomato+ colonies (>50 cells) established from
Pthrp-creER;R26R^tdTomato (n = 88) and Dlx5-creER;R26R^tdTomato (n = 5)
mice. Data are presented as mean ± s.d. d, Diagram of colony-forming
assay and subsequent analyses on self-renewal, trilineage differentiation
and transplantation of individual colony-forming cells. e, Isolation of
single PTHrP^CE-tdTomato+ colonies and subsequent subculture of
isolated clones. A, exhausting clone; B, self-renewing clone establishing
secondary colonies. Right, clone B did not proliferate at passage 2 upon
bulk culture. Red, tdTomato. Scale bars, 5 mm. n = 518 independent
experiments. f, Subcutaneous transplantation of PTHrP^CE-tdTomato+
clones into immunodeficient mice. n = 8 mice.
Extended Data Fig. 8 | Pthrp-creER+ resting chondrocytes form
columnar chondrocytes in a Hedgehog-responsive, niche-dependent
manner. a-i, Pharmacological manipulation of Hedgehog signalling.
Pthrp-creER;R26R^tdTomato distal-femur growth plates (pulsed on P6). Left
panels, vehicle control; centre panels, SAG (Hh agonist)-treated samples;
right panels, LDE225 (Hh antagonist)-treated samples. Grey, DAPI and
DIC. Scale bars, 200 µm. j, k, Quantification of tdTomato+ columns in
Pthrp-creER;R26R^tdTomato distal-femur growth plates (pulsed on P6).
P17, n = 5 (control), n = 5 (SAG), n = 4 (LDE225) mice per group. P28,
n = 4 (control), n = 3 (LDE225) mice per group. Data are presented as
mean ± s.d. P28, n = 2 (SAG). ***P < 0.001; P17 control versus SAG,
mean difference = 67.8, 95% confidence interval (37.5, 98.1); P17
control versus LDE225, mean difference = 66.0, 95% confidence interval
(33.9, 98.0); P17 SAG versus LDE225, mean difference = −1.85, 95%
confidence interval (−33.9, 30.2); P28 control versus LDE225, mean
difference = 134.5, 95% confidence interval (108.7, 160.3). One-way
ANOVA followed by Tukey's multiple comparison test. l, m, Micro-perforation
injury of growth plates. Col1a1(2.3kb)-GFP;Pthrp-creER;R26R^tdTomato distal
femurs (pulsed on P6) at P28. Micro-perforation surgery was performed
at P21. l, Left femur growth plate (control). m, Right femur growth plate
(micro-perforated). Dotted line, micro-perforated area. Grey, DAPI and
DIC. Scale bars, 100 µm. n = 3 mice.
Extended Data Fig. 9 | Resting zone of the growth plate contains a
unique class of skeletal stem cells. a, Formation of PTHrP+ skeletal stem
cells within the growth plate. A small subset of PTHrP+ chondrocytes
in the resting zone acquire properties as long-term skeletal stem cells
in conjunction with the formation of the highly vascularized secondary
ossification centre. b, PTHrP+ skeletal stem cells are heterogeneously
composed of long-term, short-term and transient populations, and
undergo asymmetric divisions and maintain themselves within the
resting zone. These cells may be supplemented by PTHrP− cells. PTHrP+
cells perform two different functions: (1) these cells differentiate into
proliferating chondrocytes, hypertrophic chondrocytes and eventually
become osteoblasts and bone marrow stromal cells at the post-mitotic
stage; (2) these cells send a forward signal (PTHrP) to control
chondrocyte proliferation and differentiation. Indian hedgehog (Ihh)
secreted by hypertrophic chondrocytes maintains the proliferation of
chondrocytes and formation of columnar chondrocytes.
Extended Data Fig. 10 | Absence of tamoxifen-independent
recombination in Pthrp-creER line. a, No tamoxifen controls of
Pthrp-creER;R26R^tdTomato mice at 6 months (left) and 1 year (right) of age. Red,
tdTomato; blue, DAPI; grey, DIC. Scale bars, 500 µm. n = 3 mice per
group. b, No tamoxifen controls of primary colonies (passage 0) isolated
from Pthrp-creER;R26R^tdTomato mice at P12 without tamoxifen injection.
Left, methylene blue (MB) staining; right, red tdTomato (TOM). Scale
bar, 5 mm. n = 3 mice. c, Dose-response curve of recombination based
on Pthrp-creER. Quantification of tdTomato+ cells in resting zone at P9 in
Pthrp-creER;R26R^tdTomato mice upon a single dose of tamoxifen at P6.
x axis, dose of tamoxifen (µg); y axis, the number of tdTomato+ cells per
1-mm thickness. n = 3 (0, 31.3 and 62.5 µg), n = 4 (15.6, 125, 250 and 500 µg)
mice per group, data are presented as mean ± s.d. d, Tamoxifen-induced
recombination in growth plates pulsed on P9. Pthrp-creER;R26R^tdTomato
distal-femur growth plates at P12 (left) and Col1a1(2.3kb)-GFP;
Pthrp-creER;R26R^tdTomato mice at P21 (right). Tamoxifen (500 µg) was
injected at P9. Green, Col1a1(2.3kb)-GFP; red, tdTomato; grey, DAPI
and DIC. Scale bars, 200 µm. n = 3 mice.
Nitrogen is an essential macronutrient for plant growth and basic
metabolic processes. The application of nitrogen-containing
fertilizer increases yield, which has been a substantial factor in the
green revolution (ref. 1). Ecologically, however, excessive application of
fertilizer has disastrous effects such as eutrophication (ref. 2). A better
understanding of how plants regulate nitrogen metabolism is 
critical to increase plant yield and reduce fertilizer overuse. Here 
we present a transcriptional regulatory network and twenty-one 
transcription factors that regulate the architecture of root and shoot 
systems in response to changes in nitrogen availability. Genetic 
perturbation of a subset of these transcription factors revealed 
coordinate transcriptional regulation of enzymes involved in 
nitrogen metabolism. Transcriptional regulators in the network 
are transcriptionally modified by feedback via genetic perturbation 
of nitrogen metabolism. The network, genes and gene-regulatory 
modules identified here will prove critical to increasing agricultural 
productivity. 

The root system takes up and metabolizes bio-available nitrogen
and transduces nitrogen signals. In response to reduced nitrogen
availability, plant development is adjusted; this includes increased
lateral root elongation to forage for nitrogen (ref. 3). Above ground, rosette
size is decreased and plants flower earlier (ref. 4). Diverse molecular events
underlie these morphological changes. Nitrogen transporters, assimilation
enzymes and signalling factors are transcriptionally regulated
in response to changes in available nitrogen (ref. 5). Post-transcriptional,
calcium- and phosphorylation-dependent signalling cascades are also
critical regulators of this transcriptional response (ref. 6). Concomitantly,
carbon metabolism and hormone pathways are also altered to adjust
metabolic pathways and plant growth (ref. 7). Sixteen transcription factors in
Arabidopsis thaliana have previously been identified to have a role in
nitrogen metabolism (refs 8, 9) (Supplementary Table 1), through a range of
approaches that includes systems-level studies (refs 10-13). Despite the importance
of the root system in regulating responses to nitrogen, only seven
of these transcription factors have previously been shown to regulate
root development in a nitrogen-dependent manner (refs 6, 7, 11, 14-17).

Using enhanced yeast one-hybrid assays, we screened for transcrip- 
tion factors that regulate nitrogen metabolism*’”*. Because nitrogen 
metabolism is interconnected with a range of different processes, 
we included target promoters from genes associated with nitrogen 
transport (12 promoters), assimilation (11 promoters), signalling (2 
promoters), connections to nitrogen metabolism through amino acid 
metabolism (5 promoters), carbon metabolism (10 promoters), carbon 
transport (4 promoters), organ growth (5 promoters) and hormone 
responses (7 promoters) as well as associated transcription factors (12 
promoters) (Supplementary Table 2). We screened these promoters 
against transcription factors expressed in roots. The resulting network 
comprises 1,660 interactions between 431 genes, 345 transcription factors
and 98 promoters (Fig. 1a, Extended Data Fig. 1, Supplementary
Table 3a). We call this network the ‘yeast one-hybrid network for 


nitrogen-associated metabolism’ (YNM). Our assays captured previ- 
ously characterized interactions: NLP7 physically binds to and regu- 
lates expression of NIR1 and CIPK8, and NLP6 binds to and regulates 
expression of NIR1‘>3. Within the YNM we found what is, to our 
knowledge, a previously undescribed putative hierarchical regulation 
of transcription factors—including both known nitrogen-regulatory 
transcription factors and transcription factors identified in this study— 
that bind to promoters of genes in many processes, such as the nitrate 
assimilation pathway (Fig. 1b, Supplementary Table 3b). A signalling 
cascade that links the nitrate-mediated regulation of Ca2+-sensor protein
kinases to transcriptional regulation via NLP7 is also significantly
overrepresented in the YNM° (P = 2.14 x 107°) (Extended Data Fig. 2a, 
Supplementary Table 4a). Moreover, the YNM is enriched for hormone- 
regulated genes, which supports previous findings that hormone sig- 
nalling is integrated into the regulation of nitrogen metabolism®”* 
(see Methods, Extended Data Figs. 2b-h, Supplementary Table 4b-h). 
The highly combinatorial nature of interactions is consistent with previ- 
ous studies that suggest that transcription factors that are central within 
the YNM may regulate multiple processes that are related to nitrogen 
metabolism’, NLP7 bound to promoters of seven nitrogen-associated 
categories (Supplementary Table 3c). One hundred and seventy-five 
transcription factors from the YNM were found to bind to gene pro- 
moters that are involved in more than one nitrogen-associated process 
(Supplementary Table 2d). 
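
The connectivity summaries described in this paragraph can be illustrated with a small Python sketch. The edge list below is a hypothetical toy example with placeholder names (it is not data from the YNM), and all function names are assumptions made for illustration. The last helper applies simple inclusion-exclusion to the reported totals, assuming that every gene in the network appears as a transcription factor, as a promoter, or as both.

```python
from collections import defaultdict

# Hypothetical toy edges: (transcription factor, promoter, promoter category).
EDGES = [
    ("TF_A", "promoter_1", "assimilation"),
    ("TF_A", "promoter_2", "signalling"),
    ("TF_A", "promoter_3", "transport"),
    ("TF_B", "promoter_1", "assimilation"),
    ("TF_B", "promoter_4", "assimilation"),
    ("TF_C", "promoter_3", "transport"),
]


def out_degree(edges):
    """Count the distinct promoters bound by each transcription factor."""
    targets = defaultdict(set)
    for tf, promoter, _category in edges:
        targets[tf].add(promoter)
    return {tf: len(promoters) for tf, promoters in targets.items()}


def multi_process_tfs(edges):
    """Return transcription factors that bind promoters in more than one category."""
    categories = defaultdict(set)
    for tf, _promoter, category in edges:
        categories[tf].add(category)
    return sorted(tf for tf, cats in categories.items() if len(cats) > 1)


def genes_in_both_roles(n_genes, n_tfs, n_promoters):
    """Inclusion-exclusion: genes counted both as a TF and as a promoter."""
    return n_tfs + n_promoters - n_genes


if __name__ == "__main__":
    print(out_degree(EDGES))
    print(multi_process_tfs(EDGES))
    # Totals reported in the text: 431 genes, 345 TFs, 98 promoters.
    print(genes_in_both_roles(431, 345, 98))
```

Under the stated assumption, the reported totals imply 345 + 98 − 431 = 12 genes that appear in both roles. This would be consistent with the 12 transcription-factor promoters included in the screen, although the source does not present this calculation.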

We used a variety of datasets and approaches to rank transcrip- 
tion factors in the YNM for functional validation. First, under the 
premise that transcription factors and their targets are co-expressed 
upon changes in nitrogen availability, we prioritized highly correlated 
transcription factors and targets for a nitrogen treatment and a cell- 
type-specific dataset (Supplementary Tables 5, 6). This approach does 
not exclude the possibility of detecting self-regulating repressors or 
activators. Second, we used the network analysis algorithm NeCorr 
(see Methods, Supplementary Table 7). Third, transcription fac- 
tors were evaluated for their outgoing connectivity (Supplementary 
Table 3e). Additionally, transcription factors were considered given 
the total number and percentage of targets that are classical nitrogen- 
metabolism genes (Supplementary Table 8). As a positive control, 
we included mutants of the transcription factors NLP7 and GNC, 
and the transceptor NPF6.3 (also known as NRT1.1 or CHL1)!1>6, 
Perturbation of nitrogen metabolism in npf6.3 (also known as nrt1.1 
or chl1) and nrt2.1 plants alters lateral root initiation and/or lateral 
root elongation”””*, With the hypothesis that these transcription factors 
regulate nitrogen metabolism and nitrogen status, we examined their 
mutant root system architecture (RSA) (see Methods) under limiting 
(1 mM KNO3) and sufficient nitrogen (10 mM KNO3) (Extended Data
Fig. 3, Supplementary Table 2c). 
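
The derived root system architecture traits follow the simple definitions given in the Fig. 2 legend (TRL = PRL + LRL; ALRL = LRL divided by LR; LRD = LR divided by PRL; and the ratio LRL/TRL). A minimal Python sketch of these definitions is shown below. The class name, the example values, the length unit and the choice to return None for undefined ratios are assumptions made for illustration, not part of the original methods.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RootMeasurement:
    """Directly measured traits for one seedling (lengths in one consistent unit)."""
    prl: float  # primary root length
    lr: int     # number of lateral roots
    lrl: float  # total lateral root length


def derived_traits(m: RootMeasurement) -> Dict[str, Optional[float]]:
    """Compute TRL, ALRL, LRD and LRL/TRL from the measured traits.

    Ratios with a zero denominator are returned as None (an assumption of
    this sketch; the source does not say how such cases were handled).
    """
    trl = m.prl + m.lrl
    return {
        "TRL": trl,
        "ALRL": m.lrl / m.lr if m.lr > 0 else None,
        "LRD": m.lr / m.prl if m.prl > 0 else None,
        "LRL/TRL": m.lrl / trl if trl > 0 else None,
    }


if __name__ == "__main__":
    # Hypothetical seedling: 50 units of primary root, 10 laterals, 25 units of laterals.
    example = RootMeasurement(prl=50.0, lr=10, lrl=25.0)
    for name, value in derived_traits(example).items():
        print(name, value)
```

Keeping the directly measured traits (PRL, LR, LRL) separate from the derived ones makes it explicit that the derived traits are not independent measurements, which may be why primary root length is also treated as a factor in the statistical model.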

Mutant alleles of seventeen genes that we identify here showed 
significant changes in at least one RSA trait relative to wild type 
(Supplementary Tables 9, 10, Supplementary Data 1, 2). chl1-5—a 
Fig. 1 | Combinatorial interactions between transcription factors and
promoters of genes associated with nitrogen metabolism, signalling 
and nitrogen-associated processes. a, Interaction network for nitrogen- 
associated metabolism. See Extended Data Fig. 1 for the full diagram, 
including gene names. Rectangles, promoters; ovals, transcription factors; 
and diamonds, genes represented as both promoters and transcription 
factors. Nitrogen-associated biological processes are indicated by promoter 
colour. A grey line indicates an interaction between transcription factor 
and promoter. Light green, nitrogen transporter; yellow, organ growth; 
dark green, nitrate assimilation; light purple, nitrogen signalling; light
blue, nitrogen-linked; orange, carbon metabolism; red, ethylene; dark blue, 
auxin; teal, carbon transporter; dark purple, amino acid metabolism; and 
pink, transcription factors linked to nitrogen. b, Transcription factor- 
promoter interactions that are associated with nitrate assimilation are 
hierarchical. Edges participating in hierarchical regulation going into the 
transcription factors (diamonds) are blue, and outgoing edges from the 
transcription factors are orange. The NLP7 and NLP6 regulators are in the 
first tier of transcription-factor binding to assimilation enzymes. 
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Fig. 2 | Phenotypes associated with transcription-factor mutant 

alleles. Mutant alleles are listed in rows and measured traits in columns. 
Statistically significant differences relative to wild type (Col-0) are shown 
with a coloured cell within the heat map (P< 0.05 as determined using a 
two-way ANOVA, exact n and P values for the analysis can be found in 
Supplementary Table 10). Trait categories are indicated with a dark-edged 
vertical line. Moving from left to right this comprises primary root length, 
lateral root number, total lateral root length, total root length, average 
lateral root length, lateral root density, ratio of total lateral root length to 
total root length, principal component analysis, rosette size and bolting 
and flowering analysis. Root traits were measured from 9-day-old plants 
grown on 1 mM KNO; or 10 mM KNO3. PRL, primary root length; LR, 
number of lateral roots; LRL, total lateral root length; total root length 
(TRL), PRL + LRL; average lateral root length (ALRL), LRL divided by 
LR; lateral root density (LRD), LR divided by PRL; LRL/TRL, LRL divided 
by TRL. ‘PR factor’ indicates that PRL was considered as a factor in the 
ANOVA model; PC1, principal component 1; PC2, principal component 2; 
PC3, principal component 3. Dark green, phenotype is larger than 

Col-0; light green, phenotype is smaller than to Col-0; horizontal black 
bar, genotype-by-condition interaction. Genotype-specific (light pink) 
and genotype-by-condition-specific (dark pink) effects are shown, when 
considering variation across all root traits in a principal component 
analysis in PC1, PC2 and PC3. Light blue, early bolting and flowering; dark 
blue, late bolting and flowering. Mutants are hierarchically clustered using 
the Manhattan distance metric. 


mutant of NPF6.3—displayed changes in its RSA that were depend- 
ent on genotype and on genotype-by-nitrate conditions (Figs. 2, 3c). 
Similarly, nlp7-1 and hmgb15-1 plants displayed larger root systems, 
with genotype-dependent changes in their RSA (Figs. 2, 3d, e). bbx16-1
plants had larger root systems under limiting nitrogen conditions 
(Figs. 2, 3f). Conversely, the myb29-1 mutant had increased lateral 
root length, lateral root density and total root length under sufficient 
nitrate, in a manner that was dependent on genotype-by-nitrate
conditions (Figs. 2, 3g). By contrast, the erf107-1 and rav2-1 plants showed a
genotype-dependent decrease in the size of traits related to their lateral 
roots, in both nitrate conditions (Figs. 2, 3h, i). The gnc mutant showed 
decreases in lateral root length that were dependent on nitrate condi- 
tions and on genotype-by-nitrate condition (Fig. 2). The phenotype 
of gnc plants differed from that of erf107-1 and rav2-1 plants, in that 
these latter two mutants did not show any dependence on nitrate con- 
ditions. The composite principal component traits provided additional 
insights into perturbations in root growth that could not be discerned 
by looking at individual traits (Extended Data Fig. 4, Supplementary 
Table 11). In these experiments, we determined genes that control
nitrogen-associated root length, lateral root development, and lateral
root development that is dependent on primary root length, and then
overlaid these on the YNM along with genes that regulate primary root
length (Supplementary Table 12) and lateral root initiation (Extended
Data Fig. 5).

Fig. 3 | Total lateral root length phenotypes that are dependent on
genotype and nitrate condition. a, Col-0 LRL is significantly longer at
1 mM KNO3 compared to 10 mM KNO3. b, Average Col-0 root growth.
Scale bar, 1 cm. c, d, The chl1-5 and nlp7-1 mutant alleles were included
as a nitrate transceptor (chl1-5, c) and master transcriptional regulator
(nlp7-1, d). Both mutants have a genotype-dependent influence on LRL at 1
and 10 mM KNO3, with an LRL that is longer than that of wild type. chl1-5
also has a genotype-by-treatment influence on LRL. e, The hmgb15-1
allele shows a genotype-dependent influence on LRL relative to wild type,
which is similar to nlp7-1 and chl1-5 relative to wild type. f, The bbx16-1
allele has an influence on LRL that is dependent on nitrate condition, with
a longer LRL only at 1 mM KNO3. g, The myb29-1 allele has an influence
on LRL that is dependent on nitrate condition, with a longer LRL only
at 10 mM KNO3. h, i, The rav2-1 (h) and erf107-1 (i) alleles are both
genotype-dependent at both 1 mM and 10 mM KNO3, with shorter LRLs.
*P < 0.05, two-way ANOVA; exact n and P values for the analysis can be
found in Supplementary Table 10. Box plots are centred at the data median
and mark from the 25th to the 75th percentile. Individual measurements
are plotted as black dots.

Given that perturbed RSA was observed in these mutants, we next 
determined whether the altered nitrogen status of the mutants affected 
shoot development and the transition from vegetative to reproduc- 
tive growth (see Methods). Mutant alleles of thirteen genes showed 
a difference in either rosette size and/or bolting and flowering time 
(Supplementary Tables 9, 10, Supplementary Data 1, 2). Plants with the 
arf18-2 allele had a smaller rosette with an increased number of days 
to flowering, whereas arf18-3 plants showed the opposite phenotype. 
A change in rosette size was coupled with a change in the time to 
bolting or flowering for four mutants. arf18-2, arf18-3 and hmgb15-1 
showed the most significant changes in both root and shoot system 
architecture. A significant reduction in both 15N in rav2-1 plants and
in the C:N ratio in nlp7-1 plants was observed (Extended Data Fig. 6). 
Classical plant physiology experiments have also associated changes
in nitrogen status with perturbation of chlorophyll levels. nlp7-1 and
gnc mutants showed significant reduction in their total chlorophyll
content, whereas lbd4-1 had increased chlorophyll content (Extended
Data Fig. 7). Changes in shoot growth in the mutants were significantly
correlated with the number of targets each transcription factor had in
the YNM (Spearman rank correlation, P < 0.05), as well as with the
number of biological processes that these transcription factors
putatively regulated (P < 0.05, Supplementary Table 13). Thus, network
connectivity is predictive of the influence of a given transcription factor
on shoot growth.

Changes in nitrogen availability are accompanied by changes in
transcription. Furthermore, the changes in development of transcription-
factor mutants under conditions of both limiting and sufficient
nitrogen are probably coordinated by perturbations in the underlying
transcriptional regulatory network. To link mutant phenotypes with
transcriptional changes, whole-genome expression was measured in
a subset of mutant genotypes (see Methods, Supplementary Tables 14,
15). To provide further support that the YNM reflects the transcriptional
regulation of nitrogen-dependent processes in the root, we
tested for enrichment of nitrogen-status genes. Genes displaying differential
expression in wild-type roots in 1 mM relative to 10 mM KNO3
were significantly enriched in the YNM (P =3.94 x 107°) (Fig. 4a).
Thus, the YNM captures transcriptional regulation of root nitrogen
status.

At the level of individual transcription factors, ARF9 and ARF18
alleles showed differential expression of nitrogen-related genes. ARF9
regulated the expression of two direct targets as predicted by the YNM
(XERICO and DUR3) as well as NRT2.4, NPF7.3, GLN2 and ASN2.
ARF18 regulated expression of three direct targets as predicted by
the YNM (NRT2.4, ANAC032 and XERICO) as well as ACS5, DUR3,
G6PD3 and AMT1;2 (Fig. 4b). HMGB15 regulated the expression of
one predicted direct target, XERICO (Fig. 4b). LBD38 is misregulated
in the mutants of ARF18, MYB29, RAV2 and HAT22; misregulation
of NRT2.4 was found in the mutants of ARF18, HAT22 and RAV2
(Fig. 4b).

Fig. 4 | The nitrate-responsive transcriptional regulatory network.
a, The network is enriched for genes that are differentially expressed in the
root grown on 1 mM KNO3 or 10 mM KNO3. Nodes that are significantly
differentially expressed are coloured according to their log(fold change)
(log(FC)), from −2.5 to 2.5 (Supplementary Table 14a, b). Node shape is
the same as Fig. 1. b, Heat map showing the expression of specific YNM
target genes: DUR3, NRT2.4, RAV2, XERICO, ANAC032, AMT1;2, GLN2,
ASN2, G6PD3, NPF7.3, ACS5 and LBD38 in 1 mM KNO3 and 10 mM
KNO3. Each cell represents the log(fold change) relative to the control, as
determined using a two-sided test with limmaVoom. *Corrected P < 0.05
(Supplementary Table 15). Four biological replicates were sampled per
genotype per condition.

A common mode of regulation in metabolism is metabolite feed- 
back. To test whether feedback is present within the YNM, we curated 
gene-expression datasets of nitrate transporters and a transceptor, 
metabolic-enzyme mutants and genotype-by-nitrogen-dependent 
changes in mutants of previously described transcriptional regulators 
of nitrogen metabolism (Supplementary Table 16, see Methods). Upon 
perturbation of nitrogen transport, sensing and metabolism, genes 
in the YNM were significantly enriched for differential expression 
(Fig. 5a). Thus, a perturbation in nitrate uptake, reduction and the 
glutamine oxoglutarate aminotransferase cycle results in transcrip- 
tional perturbation of enzymes involved in nitrogen metabolism, and 
their upstream regulators. Genetic perturbation of nitrogen metab- 
olism via the nitrogen-regulatory transcription factors also perturbs 
more genes in the YNM than expected by chance (Fig. 5a). Clustering 
analysis revealed targets of this metabolic feedback (Extended Data 
Figs. 8-10). A core set of enzymes involved in nitrogen metabolism— 
representing nearly every step of nitrate uptake, assimilation and 
conversion to glutamine and glutamate—were perturbed in most of 
the metabolic-mutant backgrounds queried. These perturbed genes 
include NPF6.3, NRT3.1, NIA1, NIR1, G6PD2, G6PD3, RFNR1, RFNR2, 
ASN1 and the transcription factor RAV2, found in this study (Fig. 5b). 
Another cluster includes known nitrogen-associated genes TGA1, 
NLP7, CIPK8, NRT2.1 and GDH2 (Fig. 5c). ANR1 is found in a cluster 
of transcription factors identified in this study, ERF107, ARF18 and 
BBX16, which are perturbed in the mutant of NLP7 and the double
mutant of TGA1/TGA4 (Fig. 5d). Similar transcriptional-regulation 
feedback on several of the transcription factors characterized in this 
study (RAV2, ERF107, ARF18 and BBX16), in addition to previously
established nitrogen-status regulators (LBD38, LBD39, TGA1, NLP7
and ANR1), further emphasizes the importance of these transcription
factors as central nitrogen regulators.

The YNM indicates the interconnected regulation of nitrogen
metabolism: the more important a transcription factor is in regulating
growth, the more likely it is to bind to promoters of genes in multiple
nitrogen-related categories. The 21 transcription factors we describe
here regulate diverse aspects of RSA and shoot development that
contribute to how growth is regulated in different nitrogen
environments. Transcriptional feedback within the YNM revealed a core set
of enzymes involved in nitrogen metabolism, and their regulators. The
mechanisms underlying this feedback remain to be determined and
may include signalling, metabolite and/or allosteric feedback, or the
action of the NPF6.3 transceptor. The identification of these genetically
regulated gene expression modules places the genes found in this study
within the existing nitrogen-regulatory framework. The transcription
factors we identify, in addition to the ‘core’ set of enzymes involved in
nitrogen metabolism, will assist in breeding efforts to generate plants
that use nitrogen more efficiently.

Fig. 5 | Transcriptional feedback upon genetic perturbation of
nitrogen metabolism or regulation. a, Bar graph showing differentially
expressed genes within the network in different mutant backgrounds.
The y axis is the number of differentially expressed genes based on
genotype (metabolism mutants) or genotype-by-nitrate condition
(transcription-factor mutants). An asterisk indicates significance at
P < 0.01 for enrichment in the network using a two-sided Fisher's exact
test (see Methods). n = 18 expression datasets. SUPRD #7 and SUPRD
#14 refer to dominant-repression mutant lines of the NLP6 gene. b, A
core set of nitrogen metabolism genes and regulators that are robust
targets of transcriptional feedback. Clusters were determined using
k-means clustering. Blue, gene significantly differentially expressed; white,
gene not significantly differentially expressed using a two-sided test with
limma (false discovery rate < 0.05, see Methods). c, d, Distinct clusters of
transcription factors and metabolic enzymes that contain transcription
factors found in this study and are targets of feedback by transcriptional
regulators of nitrogen metabolism. Each study (see ‘Sources for mutant
alleles’ in Methods) was analysed individually to test the effect of nitrate
in the different mutants. For details of the studies from which mutants are
derived, see ‘Sources for mutant alleles’ in Methods.


Online content 

Any methods, additional references, Nature Research reporting summaries, source 
data, statements of data availability and associated accession codes are available at 
https://doi.org/10.1038/s41586-018-0656-3.

METHODS


No statistical methods were used to predetermine sample size. Seeds were rand- 
omized within each experiment and investigators were blinded to allocation during 
experiments and outcome assessment. 

Promoter cloning, yeast transformation and yeast one-hybrid assays. Gene 
promoters were cloned to 2 kb or until the nearest upstream gene or synthesized 
by Life Technologies (Supplementary Table 2a). In the case of cloning, promoters 
were amplified from Col-0 genomic DNA using Phusion Taq polymerase (NEB). 
Promoters were recombined into 5'-TOPO (Invitrogen), fully sequenced and then
recombined into pMW2 and pMW3 using LR clonase II (Invitrogen). pMW2
and pMW3 constructs were sequence-confirmed. They were transformed into
the yeast strain YM4271 as previously described. If constructs were resistant to
transformation, they were transformed into the Y1H-S2 strain. Yeast colonies
were screened for autoactivation and construct presence. Promoter strains were
mated against transcription-factor strains as previously described.
Transcription-factor cloning and yeast transformation. Transcription fac- 
tors were cloned from root RNA extracted using the RNeasy Kit (Qiagen) 
(Supplementary Table 2b). Coding sequences were amplified using Phusion 
Taq polymerase (NEB). Transcription factors were recombined into D-TOPO 
(Invitrogen), fully sequenced and then recombined into pDEST-AD-2μ using LR
clonase II (Invitrogen). They were transformed into the yeast strain Yα1867 as
previously described.

Network construction. Networks were made using Cytoscape v.3.2.0. All
Cytoscape network files can be found at
https://github.com/agaudinier/Gaudinier2018.

Figure construction. Figures were made using Cytoscape, and ggplot2 v.3.0.0
in R. Figures were compiled using Inkscape (http://www.inkscape.org).

Plant material and growth conditions. Transfer DNA (tDNA) mutant lines were 
obtained through TAIR (http://www.arabidopsis.org) or collaborators. Seeds sorted 
between 250-300 μm were surface-sterilized using dichloroisocyanuric acid
solution (0.9% (w/v) dichloroisocyanuric acid solution (10% water, 90% ethanol), then
rinsed twice in 95% ethanol, and then dried completely). For the root mutant
phenotyping experiment (Supplementary Table 2c), sets of four tDNA lines and a
Col-0 control were plated in a random block design on a minimum of twelve 1-mM
KNO3 and twelve 10-mM KNO3 medium plates, and stratified at 4°C for two
nights. Medium components: 1 or 10 mM KNO3, 4 mM MgSO4, 2 mM KH2PO4,
1 mM CaCl2, 10 mM KCl, 36 mg/l FeEDTA, 0.146 g/l 2-morpholinoethanesulfonic
acid, 1.43 mg/l H3BO3, 0.905 mg/l MnCl2·4H2O, 0.055 mg/l ZnCl2, 0.025 mg/l
CuCl2·2H2O, 0.0125 Na2MoO4·2H2O, 1% sucrose, 0.75% phytagel, pH 5.7.

For the shoot phenotyping experiment, sorted seeds were stratified at 4°C for 
two nights and sown on Sunshine Mix soil in flats containing 18 pots. Seventeen 
genotypes, plus Col-0, were randomized in a partial random block design for 8 
or 9 biological replicates per experiment for a total of three experiments. Plants 
were watered twice a week, switching between a modified Hoagland’s solution and 
deionized water. Modified Hoagland’s solution components [16×]: 1.6 g/l KNO3,
0.55 g/l KH2PO4, 3.85 g/l MgSO4, 3.57 g/l KCl, 2.35 g/l CaCl2, 1.34 g/l Sprint
330, 2.97 mg/l H3BO3, 3.17 mg/l MnCl2·4H2O, 4.6 mg/l ZnSO4·7H2O, 0.4 mg/l
CuSO4·5H2O, 0.39 mg/l H2MoO4·H2O, pH 5.5.

For the RNA-sequencing (RNA-seq) experiment to characterize gene expression 
in each mutant background, 200-300 seeds per plate were sown on Petri plates with 
medium containing 1 mM or 10 mM KNO3 and nylon mesh, and stratified for two
nights at 4°C. Two plates of seedlings per genotype were grown and combined for
each biological replicate. Four biological replicates were grown per genotype and
treatment. Roots of 9-day-old seedlings were collected from 6-7 h after sunrise
and immediately frozen in liquid N2.

RNA-seq library preparation and pooling of technical replicates. RNA-seq 
libraries were prepared following the BRAD-Seq DGE protocol. Libraries were
sequenced using the Illumina HiSeq 3000 in SR50 mode. Two technical replicate
libraries were created from each RNA sample and after assessing sufficient repro- 
ducibility, counts across technical replicates were pooled together. Pooling was 
performed by summing the counts for the same gene across equivalent replicates. 
The merged file was subjected to the same quality processing. The number of 
mapped reads for each biological replicate and correlation of replicates are found 
in Supplementary Table 14c, d. 
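
The pooling step described above can be sketched in a few lines. This is an illustrative Python sketch, not the authors' code; the library names, sample names and gene labels are hypothetical placeholders, and the mapping from library to biological sample is assumed to be known in advance.

```python
from collections import defaultdict


def pool_technical_replicates(counts_by_library, library_to_sample):
    """Sum per-gene counts over libraries that share a biological sample."""
    pooled = defaultdict(lambda: defaultdict(int))
    for library, gene_counts in counts_by_library.items():
        sample = library_to_sample[library]
        for gene, count in gene_counts.items():
            pooled[sample][gene] += count
    return {sample: dict(genes) for sample, genes in pooled.items()}


# Hypothetical example: two technical replicate libraries of one RNA sample.
counts_by_library = {
    "col0_rep1_techA": {"geneA": 120, "geneB": 35},
    "col0_rep1_techB": {"geneA": 130, "geneB": 40},
}
library_to_sample = {
    "col0_rep1_techA": "col0_rep1",
    "col0_rep1_techB": "col0_rep1",
}
pooled = pool_technical_replicates(counts_by_library, library_to_sample)
```

Summing raw counts, rather than averaging, keeps the pooled values as integer counts for downstream count-based modelling.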

RNA-seq read processing and differential expression analysis. Before and after 
read processing, libraries were analysed with FastQC
(http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) to assess the
quality of the sequences. We trimmed barcodes from raw reads using
fastx-trimmer (http://hannonlab.cshl.edu/fastx_toolkit/index.html) with
parameters: -f 9 -v -Q 33. This was followed by adaptor trimming and quality
filtering using reaper, from the Kraken Suite, with options: -geom no-bc
-dust-suffix-late 10/ACTG -dust-suffix 10/ACTG --noqc -nnn-check 1/1
-qqq-check 33/10 -clean-length 30 -tri 20 -polya 5 --bcq-late.
Trimmed reads were mapped to the reference genome of A. thaliana (TAIR 10)
using bowtie (-a --best --strata -n 1 -m 1 -p 4 --sam --tryhard) with subsequent
conversion to BAM format using samtools. HTSeq-count was used to obtain
raw counts.

Differential gene expression analysis was done using limma in
R/Bioconductor, with empirical weights estimated for each observation using the
voomWithQualityWeights function. Quantile normalization was used to account 
for different RNA inputs and library sizes. The linear model for each gene was 
specified as: log(counts per million) of a particular gene = mutant + treatment + 
mutant:treatment. Specific contrasts were constructed to compare each mutant to 
the control, and each genotype x treatment interaction. Differentially expressed 
genes were selected based on a false discovery rate < 0.05 (Supplementary 
Table 15). 
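
To make the terms of this linear model concrete, the following Python sketch computes the coefficients for a single gene in a 2 × 2 design. It is an illustration only: it omits the voom observation weights, quantile normalization and limma's moderated statistics, and the cell labels and log(counts per million) values are hypothetical.

```python
from statistics import mean


def interaction_model_coefficients(log_cpm):
    """Coefficients of log(CPM) ~ mutant + treatment + mutant:treatment.

    `log_cpm` maps (genotype, treatment) -> list of replicate values, with
    genotype in {"wt", "mutant"} and treatment in {"1mM", "10mM"}.
    With treatment coding, the saturated 2x2 model is solved exactly by
    differences of cell means.
    """
    m = {cell: mean(values) for cell, values in log_cpm.items()}
    baseline = m[("wt", "1mM")]
    return {
        "intercept": baseline,
        "mutant": m[("mutant", "1mM")] - baseline,
        "treatment": m[("wt", "10mM")] - baseline,
        # Interaction: does the nitrate response differ between genotypes?
        "mutant:treatment": (m[("mutant", "10mM")] - m[("mutant", "1mM")])
        - (m[("wt", "10mM")] - baseline),
    }


# Hypothetical log(CPM) values for one gene, four replicates per cell.
example = {
    ("wt", "1mM"): [5.0, 5.2, 4.8, 5.0],
    ("wt", "10mM"): [6.0, 6.1, 5.9, 6.0],
    ("mutant", "1mM"): [4.0, 4.1, 3.9, 4.0],
    ("mutant", "10mM"): [4.5, 4.4, 4.6, 4.5],
}
coefficients = interaction_model_coefficients(example)
```

In this reading, the "mutant" coefficient corresponds to a mutant-versus-control contrast at the baseline nitrate level, and the "mutant:treatment" coefficient to the genotype × treatment interaction.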

Generation of the publicly available gene expression profiling datasets for
nitrogen-responsive genes. We compiled a comprehensive dataset of publicly
available gene-expression responses of wild-type plants in response to nitrogen
availability in both the root and shoot (GEO accession GSE18984; refs 7, 10, 12, 15, 19, 41-45),
as well as profiling of nitrogen-status gene expression changes in specific root
cell types. Data from ATH1 Affymetrix arrays were downloaded from the NCBI
GEO database and imported into R using the affy package in Bioconductor.
Arrays were normalized using the robust multi-array average (RMA) method.
Gene expression was averaged across biological replicates, and then treatment
datasets were expressed relative to their appropriate controls (Supplementary Table 5).
Pearson and Spearman correlations were calculated in R for the treatment and 
cell-type-specific datasets for all transcription factor-target pairs (Supplementary 
Table 6). 

Spearman rank correlation analysis of root and shoot phenotypes relative to 
network connectivity and related metrics. We prioritized transcription factors 
from the YNM with a Pearson or Spearman rank correlation greater than +0.5 (for 
the nitrogen treatment dataset) or greater than +0.8 (for the cell-type-specific data- 
set) with their target genes (Supplementary Table 6). Spearman rank correlations 
were calculated in R using rcorr() from the Hmisc package (https://cran.r-project. 
org/package=Hmisc) for the phenotype traits of transcription-factor mutants, 
relative to network connectivity and correlation with targets. Data and correlations 
can be found in Supplementary Table 13. 
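
For readers unfamiliar with the statistic, Spearman's rank correlation is the Pearson correlation of the rank-transformed data. The sketch below is a plain-Python illustration of that definition with hypothetical input values; it is not the rcorr() implementation and does not compute P values.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: number of network targets per transcription factor
# versus the size of a phenotypic change in the corresponding mutant.
targets = [2, 5, 9, 14, 20]
trait_change = [0.1, 0.3, 0.2, 0.6, 0.9]
rho = spearman(targets, trait_change)
```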

Generation of the publicly available dataset for nitrogen-metabolism mutants
and mutants of transcription factors associated with nitrogen metabolism. 
Affymetrix arrays were read using the affy package in Bioconductor. Agilent and 
Complete Arabidopsis Transcriptome Micro Array (CATMA) arrays were read with 
the read.maimages() function from limma; the source option was set to ‘agilent’ 
for the former and ‘genepix’ for the latter. After arrays were read, limma was used 
for downstream processing, normalization and differential expression analysis. 
In brief, Affymetrix arrays were normalized using the RMA method. Agilent and 
CATMA arrays were subjected to background correction and normalization using 
the functions backgroundCorrect() and normalizeBetweenArrays(). After nor- 
malization and filtering, differential expression was analysed using the standard 
limma approach. 

NECorr. The starting hypothesis of NECorr is that an important interaction
for a stimulus response is that of a regulator acting on one or several hub genes.
Hence, hub genes will propagate the systemic cascade appropriate to the stimulus.
Thenceforth the dynamics of the molecular network will evolve. This approach
ranks transcription factors given several network metrics including betweenness
centrality, degree distribution and as a function of their gene-expression
similarity.

Hub calculation. The first step is a heuristic model, which merges molecular net- 
work topology and gene-expression data. NECorr-Hub is a linear model including 
five parameters: condition or tissue specificity of gene expression, co-expression 
of interactions across conditions, and the molecular network centralities between- 
ness, connectivity and transitivity. The rank given to each of these parameters was 
decided empirically. 

Both genes of an interaction pair need to be co-expressed in most of the tissues 
and/or conditions, which shows that they can influence each other. Correlated 
gene expression was considered as the highest-ranking parameter, followed by 
gene-expression specificity in the studied tissue or condition. In addition, a high 
level of connectivity of the gene in the molecular network is required to generate 
a proper response. Connectivity can be defined in several manners: betweenness, 
degree connectivity and transitivity were chosen as the most meaningful central- 
ities to define gene importance as a hub. 

Based on the ranking, each parameter weight was estimated using the analytic
hierarchy process (AHP), a multiple-criteria decision analysis method.
The AHP is applied through the R package pmr
(https://cran.r-project.org/package=pmr). The importance of the five
parameters is generated by pairwise comparisons. Hence, this leads to an
adjacency matrix of pairwise weight importance. From this adjacency matrix,
eigenvectors are calculated to assign a weight to each parameter. The AHP
method is applied as follows. Each gene is ranked for the five parameters
above. Each ranked parameter is standardized in values between 0 and 1
(z-score), to obtain data with the same scale. For each tissue and/or
condition, the parameter weights are applied as factors of a linear model that is used
to prioritize hub genes. 
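
The eigenvector step of the AHP can be illustrated as follows. This is a minimal Python sketch rather than the pmr implementation used in the study: it approximates the principal eigenvector by power iteration, and the pairwise judgements are hypothetical values for three criteria, not the comparisons used in NECorr.

```python
def ahp_weights(pairwise, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison matrix.

    pairwise[i][j] states how much more important criterion i is than j,
    with pairwise[j][i] == 1 / pairwise[i][j]. Power iteration is used here
    for illustration; the weights are normalized to sum to 1.
    """
    n = len(pairwise)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [
            sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)
        ]
        total = sum(product)
        weights = [value / total for value in product]
    return weights


# Hypothetical judgements for three criteria (not the values used in NECorr).
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
```

Because this example matrix is perfectly consistent, the resulting weights are proportional to 4:2:1.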

For condition 1: $\mathrm{Ranking}_{\mathrm{condition\,1}} = w_1 \times \mathrm{IntSig} + w_2 \times \mathrm{TS} + w_3 \times \mathrm{BetC} + w_4 \times \mathrm{Cot} + w_5 \times \mathrm{Trs}$,
in which IntSig represents the interaction significance (co-expression
significance in the interaction involving the gene), TS represents tissue
specificity (selectivity), BetC represents betweenness centrality, Cot represents
connectivity centrality and Trs represents transitivity. The weights are defined as
$w_1 = w_2 > w_3 > w_4 > w_5$.

To rank the interactions in the molecular network for a given condition, the
average ranking of the two genes defining this edge is taken. When several
conditions are evaluated, the gene ranking between conditions can be done by averaging
each condition ranking. 
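
The weighted sum and the edge-level averaging described above can be sketched as below. This Python example is illustrative and is not taken from the NECorr source: the weights, gene names and standardized parameter values are hypothetical, and the edge value is computed as the mean of the two gene scores as one reading of the averaging step.

```python
def hub_score(parameters, weights):
    """Weighted sum of standardized parameters for one gene in one condition."""
    return sum(weights[name] * parameters[name] for name in weights)


def edge_score(gene_scores, source, target):
    """Score an interaction as the mean of the scores of its two genes."""
    return (gene_scores[source] + gene_scores[target]) / 2


# Hypothetical weights chosen to satisfy w1 = w2 > w3 > w4 > w5.
weights = {"IntSig": 0.3, "TS": 0.3, "BetC": 0.2, "Cot": 0.15, "Trs": 0.05}

# Hypothetical standardized parameter values for two genes.
genes = {
    "TF_A": {"IntSig": 0.9, "TS": 0.8, "BetC": 0.5, "Cot": 0.4, "Trs": 0.2},
    "target_B": {"IntSig": 0.6, "TS": 0.2, "BetC": 0.1, "Cot": 0.3, "Trs": 0.0},
}
gene_scores = {name: hub_score(values, weights) for name, values in genes.items()}
edge = edge_score(gene_scores, "TF_A", "target_B")
```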
NECorr-Hub parameter estimation. Molecular network topology centralities are 
obtained using the R package igraph (https://cran.r-project.org/package=igraph).
Co-expression analysis (or the significance of each interaction) was estimated using
an Rcpp script to evaluate the Gini correlation coefficient related to each interaction.
The Gini correlation coefficient was previously shown to be an effective method
for detecting transcription-factor activity. The co-expression significance for
each gene is evaluated by averaging the magnitude of the correlation from all the
interactions containing this particular gene using Fisher’s method.

The genes with tissue and/or condition specificity (or selectivity) are detected
using the intersection-union test (IUT) with a relaxed threshold (raw P = 0.5)
(https://cran.r-project.org/package=igraph). The tissue and/or condition-selective
genes or tissue and/or condition-excluded genes are assigned for each specific tissue
and/or condition within a set of samples. These genes attributed to a tissue and/or
condition are fuzzy owing to the low selection threshold of the gene in IUT; a gene
could therefore appear in a different tissue or condition as selective or excluded.
Second, these selected genes are ranked for their tissue selectivity or exclusion using
the tissue specificity index (TSI). We define both a positive and negative TSI:


positive TSI = $\frac{\sum_{i=1}^{N} (1 - x_i / x_{\max})}{N - 1}$

The negative tissue specificity index measures the extent to which a gene is excluded
from a tissue or condition:

negative TSI = $\frac{\sum_{i=1}^{N} (x_i - x_{\min} / x_{\max})}{N - 1}$

The results for TSI measurements are merged to obtain a ranking of all the tissue 
and/or condition-selective or -excluded genes defined from IUT test. 
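
As a worked illustration of the positive TSI, the following Python sketch assumes the commonly used form of the index, in which each expression value is scaled by the maximum across the N tissues or conditions. The function name and input values are hypothetical, and the sketch does not cover the negative TSI or the IUT selection step.

```python
def positive_tsi(expression):
    """Tissue specificity index: sum(1 - x_i / x_max) / (N - 1).

    Returns 0 for uniform expression and 1 when a single tissue or
    condition accounts for all of the expression.
    """
    n = len(expression)
    x_max = max(expression)
    if n < 2 or x_max <= 0:
        raise ValueError("need at least two profiles and a positive maximum")
    return sum(1 - x / x_max for x in expression) / (n - 1)


# Hypothetical expression of one gene across three tissues.
tsi = positive_tsi([10.0, 5.0, 0.0])
```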

NECorr rankings can be found in Supplementary Table 7.

Code. The NECorr source code is maintained in GitHub:
https://github.com/warelab/NECorr.

Mutant line selection. The mutant lines acquired for this study represent most of 
the top-ranked and intermediate-ranked genes that were deemed interesting for 
having important binding targets (Supplementary Tables 10, 11). 

Root phenotyping data collection and analysis. Traits measured included PRL, 
LR and LRL. Additionally, composite traits were considered, including total root 
length (TRL=PRL + LRL), average lateral root length (ALRL=LRL/LR), lateral root 
density (LRD = LR/PRL) and the percentage of LRL contributing to TRL (LRL/TRL) 
as well as the partitioning of variation across these mutants relative to wild type, using 
principal component analysis of all RSA traits (Supplementary Data 1).

Plates were scanned using the V750 scanner. Primary roots and lateral roots
of 9-day-old seedlings were traced using a Wacom Bamboo tablet in ImageJ.
Data were log-transformed and analysed using ANOVA in R. Using a two-way 
ANOVA, three phenotypic categories were considered: genotype effects in both 
nitrogen conditions (genotype-dependent), genotype effects in only one condition 
(nitrogen-condition-dependent) or genotype by nitrogen condition-dependent 
effects (P < 0.05, Supplementary Data 2). The extent to which lateral root traits 
are uncoupled from PRL is not clear; thus, an additional ANOVA model was used
that included PRL as a factor—with the hypothesis that lateral root emergence or 
elongation may be dependent on PRL. As expected, composite traits extracted from 
the principal component analysis were significantly correlated with a number of 
RSA traits (P < 0.05, Extended Data Fig. 4, Supplementary Table 11). The scripts 
for analyses can be found at https://github.com/agaudinier/Gaudinier2018. All 
ANOVA tables can be found in Supplementary Data 2. A summary of the statistics 
can be found in Supplementary Table 13. 
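
The composite root traits defined at the start of this section follow directly from the three measured traits. The Python sketch below shows the arithmetic for a single hypothetical seedling; the function name, units and input values are illustrative assumptions, and the log transformation and ANOVA steps are not reproduced here.

```python
def composite_root_traits(prl, lr, lrl):
    """Derive composite RSA traits from primary root length (PRL),
    lateral root number (LR) and total lateral root length (LRL)."""
    trl = prl + lrl
    return {
        "TRL": trl,
        # Average lateral root length is undefined without lateral roots.
        "ALRL": lrl / lr if lr else float("nan"),
        "LRD": lr / prl,
        "LRL/TRL": lrl / trl,
    }


# Hypothetical seedling: PRL 6.0 cm, 8 lateral roots, LRL 4.0 cm.
traits = composite_root_traits(prl=6.0, lr=8, lrl=4.0)
```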
Principal component analysis. Mutant and wild-type controls were plotted in R 
using the prcomp() function. The loadings for each principal component (PC1- 
PC3) in the mutant and wild-type sets were analysed using ANOVA in R. The script 
for the analysis can be found at https://github.com/agaudinier/Gaudinier2018. All 
ANOVA tables can be found in Supplementary Data 2. 




Shoot phenotyping data collection and analysis. Plants were photographed at 
15 and 22 days old. ImageJ was used to analyse rosette size. Bolting and flowering
days were recorded. Rosette-size data were log-transformed, and bolting and
flowering days were reciprocal-transformed; both were analysed using a two-way
ANOVA in R. All ANOVA tables can be found in Supplementary Data 2. A summary
of the statistics can be found in Supplementary Table 13.

Chlorophyll extraction and analysis. Full rosette leaves were measured for their 
chlorophyll content index using the CCM-200 plus (Opti-Sciences). Chlorophyll 
measurements were done by collecting supernatants of discs from nlp7-1, chl1-5 
and Col-0 (control) leaves extracted in two extractions of 80% HEPES-buffered 
ethanol heated to 80°C and one extraction of 50% HEPES-buffered ethanol. 
Absorbance for the supernatant was measured at 652 and at 665 nm. Total
chlorophyll was calculated as chlorophyll = 22.12 A652 + 2.71 A665, according to a
previously published method.

Quantification of 15N and 13C abundance. Rosettes of 20-day-old plants were
collected and dried at 60°C for two days. Dried rosettes were homogenized. Samples
of 0.7-3 mg were submitted to the Stable Isotope Facility at University of California
at Davis for analysis of natural abundance levels of 15N and 13C using an elemental
analyser with a continuous flow isotope ratio mass spectrometer. 

YNM network analyses. Genotype expression and expression dependent on
genotype-by-nitrate condition. To test whether genes in the YNM are significantly
enriched among differentially expressed genes, a Fisher’s exact test was performed
in R using the standard function fisher.test(). For this, we queried whether the
number of YNM-predicted genes that overlap with differentially expressed genes
was greater than the number of differentially expressed genes that did not overlap
with YNM genes; as a background, we used genes that were not differentially
expressed. We performed this test for every contrast, which means that the groups
of genes changed between each test but the total number of genes remained the
same.
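
The enrichment test can be illustrated with a small, self-contained Python sketch of a two-sided Fisher's exact test on a 2 x 2 table. This is not the authors' script (they used fisher.test() in R). The table layout (YNM membership by differential-expression status) and the gene counts are assumptions made for illustration, and the tolerance factor mirrors the convention we understand R to use when summing probabilities no larger than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    denom = comb(n, col1)
    probs = {
        k: comb(row1, k) * comb(n - row1, col1 - k) / denom
        for k in range(lo, hi + 1)
    }
    p_obs = probs[a]
    # Small relative tolerance so that ties with p_obs are included.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7)))


# Hypothetical counts: rows = in YNM / not in YNM,
# columns = differentially expressed / not differentially expressed.
p_value = fisher_exact_two_sided(30, 70, 200, 1700)
```

Because the test is repeated for every contrast, the four cell counts change from test to test while their sum stays constant.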

To test for enrichment of various pathways, a list of CPK-NLP7-dependent
genes, a list of primary-root developmental genes (Supplementary Table 12) and
lateral-root developmental genes, and a list of hormone-responsive genes were
queried. A Fisher’s exact test was used to determine whether the proportions of
genes from these datasets were enriched in the YNM. The background for the
CPK-NLP7 test includes all genes in the Arabidopsis genome that are not part of
the YNM. The background for the root development and hormone tests includes
all genes on the ATH1 Affymetrix microarray that are not part of the YNM.
Transcriptional feedback of nitrogen metabolism enzymes and regulators. To test
whether any feedback is present within the YNM, we curated whole-genome
expression datasets from mutants of nitrate transporters, a transceptor or
metabolic enzymes (GEO accession GSE10786): NPF6.3, the double mutant
of NIA1/NIA2, GLU1, NRT2.4 and the triple mutant of GDH1/GDH2/GDH3
(Supplementary Table 16). We also considered the expression of genes in mutants
of previously described transcriptional regulators of nitrogen metabolism (ANR1,
NLP6, NLP7, BZIP1 and the double mutants of TGA1/TGA4 and NLP6/NLP7) that
show changes in gene expression dependent on genotype-by-nitrogen condition,
relative to wild type, in a genotype-by-condition analysis
(GEO accession GSE6824) (Supplementary Table 16).

Enrichment was calculated as above. We tested whether the number of
differentially expressed genes within the YNM was greater than the number of
differentially expressed genes that did not overlap with genes in the YNM. Each
microarray study was analysed independently.
Clustering analysis of transcriptional feedback on YNM. k-means clustering 
was performed using the presence or absence calls of significantly differentially 
expressed genes that were differentially expressed in at least one of the contrasts; 
if a gene predicted by the network was significantly differentially expressed, we 
assigned it a value of 1, and if it was not significantly differentially expressed we 
assigned it a value of 0. Genes with 0 across all contrasts were not considered for 
the analysis (not differentially expressed in any condition). In short, the algorithm 
used was to calculate the Euclidean distance of the binary matrix (dist function in 
R), then to obtain the principal components of this distance using a correlation 
matrix (princomp function in R with cor = T) and select the scores for the first 
two components. We then calculated the clusters based on these two first principal 
components, and a selected value for k. 
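
The first two steps of this procedure (building the presence or absence matrix and computing Euclidean distances) can be sketched in Python as follows. The gene and contrast names are hypothetical, the helper functions are ours rather than the authors', and the subsequent principal-component and k-means steps are not reproduced here.

```python
import math


def presence_absence(de_calls, genes, contrasts):
    """Build a 0/1 row per gene across contrasts; drop genes that are 0 everywhere.

    de_calls maps each contrast name to the set of genes called significantly
    differentially expressed in that contrast.
    """
    matrix = {}
    for gene in genes:
        row = [1 if gene in de_calls[c] else 0 for c in contrasts]
        if any(row):
            matrix[gene] = row
    return matrix


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Hypothetical input, for illustration only.
contrasts = ["contrast1", "contrast2", "contrast3"]
de_calls = {
    "contrast1": {"geneA"},
    "contrast2": {"geneA", "geneB"},
    "contrast3": set(),
}
matrix = presence_absence(de_calls, ["geneA", "geneB", "geneC"], contrasts)
distance_ab = euclidean(matrix["geneA"], matrix["geneB"])
```

Here geneC is excluded because it is not differentially expressed in any contrast, matching the filtering rule described above.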

The number of clusters (k) was selected by analytical and empirical analysis: 
the ‘elbow method’ looks at different values for k and their relationship with the 
within-cluster sum of squares, in which the optimal value is that at which the line 
starts to plateau. We then tested different values for k within a threshold given by 
the elbow method and selected that in which biologically relevant clusters were 
observed. 
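
To illustrate the elbow method, the following Python sketch runs a plain Lloyd-style k-means for several values of k and records the within-cluster sum of squares (WCSS). It is a toy implementation under stated assumptions, not the R routine used in the study: centroids are initialised from the first k points (a simplification that makes the run deterministic), and the two-dimensional points are hypothetical stand-ins for the scores on the first two principal components.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm; returns (cluster assignment, within-cluster sum of squares)."""
    centroids = [list(p) for p in points[:k]]  # deterministic init (assumption)
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: assignments no longer change
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    wcss = sum(sq_dist(p, centroids[a]) for p, a in zip(points, assignment))
    return assignment, wcss


# Hypothetical scores forming three well-separated groups.
points = [(0, 0), (10, 10), (20, 0), (0, 1), (10, 11), (20, 1)]
wcss_by_k = {k: kmeans(points, k)[1] for k in range(1, 5)}
```

Plotting wcss_by_k against k for these toy points shows a sharp drop up to k = 3 and little improvement afterwards, which is the plateau the elbow method looks for; the final choice among nearby values of k would still rest on biological relevance, as described above.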

Identification of the dominant pattern in transcription-factor mutants of
nitrogen-responsive genes in the root. Data in each mutant background
were filtered for genes that were significantly differentially expressed owing to
nitrogen treatment in wild-type plants (wild type 1 mM versus wild type 10 mM)
(Supplementary Table 18). The expression of each of these genes was obtained
by taking the difference in log(FC) between the effect of nitrogen on each of the
mutants (that is, arf18-3 1 mM versus arf18-3 10 mM) and the wild type (wild
type 1 mM versus wild type 10 mM). Dominant patterns of expression were then
identified using a previously published algorithm (with parameters set as
follows: minExpFilter = FALSE; minVarFilter = FALSE; fuzzykmemb = 1.04;
alreadyLog2 = TRUE). The choice of number of clusters was set to kChoice = 7.
Clustering of genes, the expression of which changes upon variation in nitrogen
availability in the wild-type root, revealed that these mutants have similar
perturbations in nitrogen-associated gene regulation (Extended Data Fig. 9).
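
The per-gene quantity passed to the clustering step can be sketched as below. The direction of the subtraction (mutant minus wild type) is an assumption, as are the function name, gene names and values; only genes present in the wild-type filtered set are retained.

```python
def nitrogen_response_shift(mutant_logfc, wild_type_logfc):
    """Difference in log fold change (1 mM vs 10 mM) between mutant and wild type.

    Only genes in the wild-type set (assumed to be already filtered for a
    significant nitrogen response) are kept. Mutant minus wild type is an
    assumed sign convention.
    """
    return {
        gene: mutant_logfc[gene] - wild_type_logfc[gene]
        for gene in wild_type_logfc
        if gene in mutant_logfc
    }


# Hypothetical log2 fold changes, for illustration only.
wild_type = {"geneA": 2.0, "geneB": -1.0}
mutant = {"geneA": 0.5, "geneB": -1.0, "geneC": 3.0}
shift = nitrogen_response_shift(mutant, wild_type)
```

A value near zero means that the mutant responds to nitrogen as the wild type does; larger absolute values indicate a perturbed response.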

Sources for mutant alleles. The sources for the mutant alleles displayed in Fig. 5 
are as follows: tga1/tga4 (ref. 15); nrt2.4 (GEO accession GSE10786); chl1-12, chl1-5
(1) and chl1-9 (ref. °°); nlp7-1 (1) (ref. "); anr1 (GEO accession GSE6824); nia1/nia2
(ref. °'); chl1-5 (2) (ref. “4); glu1-2 leaf and glu1-2 root (ref. ©); NLP6 SUPRD
#7 and NLP6 SUPRD #14 (ref. '°); nlp7-1 (2), nlp7-3 and nlp6/nlp7 (ref. *); bzip1-1 
(ref. 1°); and gdh1/gdh2/gdh3 (ref. ®). 

Code availability. Code for plant phenotyping analysis can be found at
https://github.com/agaudinier/Gaudinier2018. Code for NeCorr analysis can be found at
https://github.com/warelab/NECorr.

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 


Data availability 

RNA sequencing data that support the findings of this study have been deposited 
in NCBI with the primary accession code GSE107988. Supplementary Tables, 
R code and Cytoscape files can be found at: https://www.bradylab.org/resources/ 
or https://github.com/agaudinier/Gaudinier2018. 
Extended Data Fig. 1 | Combinatorial interactions between
transcription factors and promoters of genes associated with nitrogen
metabolism, signalling and nitrogen-associated processes. Rectangles,
promoters; ovals, transcription factors; diamonds, genes represented as
both promoters and transcription factors. Nitrogen-associated biological
processes are indicated by promoter colour. A grey line indicates
a transcription factor-promoter interaction. Light green, nitrogen
transporter; yellow, organ growth; dark green, nitrate assimilation; light
purple, nitrogen signalling; light blue, nitrogen-linked; orange, carbon
metabolism; red, ethylene; dark blue, auxin; teal, carbon transporter;
dark purple, amino acid metabolism; pink, transcription factors linked to
nitrogen.
Extended Data Fig. 2 | Genes in the YNM regulated by hormone
signalling. The YNM. Genes coloured in each panel are regulated by the
CPK-NLP7 signalling cascade or indicated hormone. P value indicates
significance for enrichment in the network using a two-sided Fisher’s
exact test. a, Genes regulated by the CPK-NLP7 signalling cascade
(cyan). b, Genes regulated by abscisic acid (purple). c, Genes regulated by
ethylene (red). d, Genes regulated by methyl jasmonate (orange). e, Genes
regulated by auxin (dark blue). f, Genes regulated by cytokinin (light
blue). g, Genes regulated by brassinosteroid (green). h, Genes regulated by
gibberellic acid (pink). Gene lists used for enrichment tests can be found
in Supplementary Table 4.
Panel titles and P values: CPK-NLP7, P = 2.14E-09; abscisic acid, P = 1.04E-09;
1-amino-cyclopropane-1-carboxylic acid (ethylene), P = 2.44E-82; methyl jasmonate,
P = 2.43E-13; indole-3-acetic acid (auxin), P = 3.72E-10; zeatin (cytokinin),
P = 0.000278; brassinolide (brassinosteroid), P = 0.000143; gibberellic acid 3
(gibberellin), P = 0.00186.
Extended Data Fig. 3 | Wild-type root growth. RSA for wild-type (Col-0)
nine-day-old seedlings in both limiting (1 mM) and sufficient (10 mM)
KNO₃ conditions. a-g, Traits measured were primary root length (a),
number of lateral roots (b), total lateral root length (c), average lateral root
length (d), total root length (e), lateral root density (f) and the ratio of
lateral root length contributing to the total root length (g). Box plots are
centred at the data median and mark from the 25th to the 75th percentile.
Individual measurements are plotted as black dots. n = 209 (1 mM KNO₃),
n = 201 (10 mM KNO₃). P values were calculated using two-way ANOVAs.
grown on 1 mM KNO₃. a, PC1 captures 69% of the variation and PC2 KNO₃).
Extended Data Fig. 5 | YNM sub-network involved in nitrogen-associated
influence on RSA. a, The YNM. Blue, genes associated with root length
(Supplementary Table 10); yellow, genes associated with lateral root
development; green, genes associated with root length and lateral
root development. Heavy black borders denote genes with a mutant root
phenotype from this study. b, Sub-network of YNM with genes associated
with RSA, and their first neighbour connections.
Extended Data Fig. 6 | Nitrogen, carbon and carbon:nitrogen ratio in
transcription-factor mutants. a, Percentage of natural abundance of ¹⁵N
in total shoot tissue. b, Percentage of natural abundance of ¹³C in total
shoot tissue. c, Ratio of natural abundance of ¹³C to ¹⁵N. *P < 0.05 using
a two-way ANOVA; exact n and P values for the analysis can be found in
Supplementary Table 10. Box plots are centred at the data median and
mark from the 25th to the 75th percentile. Individual measurements are
plotted as black dots.
Extended Data Fig. 7 | Chlorophyll levels across transcription-factor
mutants. a, Chlorophyll levels measured by chlorophyll content index.
b, Total chlorophyll levels measured by ethanol extraction. *P < 0.05 using
a two-way ANOVA; exact n and P values for the analysis can be found in
Supplementary Table 10. Box plots are centred at the data median and
mark from the 25th to the 75th percentile. Individual measurements are
plotted as black dots.
Extended Data Fig. 8 | Clustering of nitrogen-responsive genes in the
root, in transcription-factor mutants. The expression in the root of genes
responsive to nitrogen availability (Supplementary Table 15) was analysed
in the mutant background of each transcription factor, and clustered
using dominant pattern identification. Gene expression in each mutant
background was expressed as the log2(fold change) of the expression of
a given gene in 1 mM nitrate relative to 10 mM nitrate, and relative to
its expression in wild type (log2(fold change) in 1 mM nitrate relative to
10 mM nitrate). Colours on the y axis indicate each respective cluster or
module. Gene names are indicated on the far right.
Extended Data Fig. 9 | Clusters of YNM genes in mutants of enzymes
involved in nitrogen metabolism and their transcriptional regulators.
a, Clusters of genes significantly differentially expressed in the microarray
analysis of nitrogen-metabolism mutants and nitrogen transcriptional
regulator mutants. b, Clusters overlaid on the YNM.
Extended Data Fig. 10 | Differentially expressed genes in the YNM
in mutants of enzymes involved in nitrogen metabolism, and their
transcriptional regulators. The YNM. Genes are coloured by the number
of mutant datasets in which they are found to be differentially expressed
(white = 0, dark purple = 10).

© 2018 Springer Nature Limited. All rights reserved. 


LETTER 


https://doi.org/10.1038/s41586-018-0658-1 


The SWI/SNF complex is a mechanoregulated inhibitor of YAP and TAZ


Lei Chang, Luca Azzolin, Daniele Di Biagio, Francesca Zanconato, Giusy Battilana, Romy Lucon Xiccato,
Mariaceleste Aragona, Stefano Giulitti, Tito Panciera, Alessandro Gandin, Gianluca Sigismondo, Jeroen Krijgsveld,
Matteo Fassan, Giovanna Brusatin, Michelangelo Cordenonsi* & Stefano Piccolo


Inactivation of ARID1A and other components of the nuclear 
SWI/SNF protein complex occurs at very high frequencies in a 
variety of human malignancies, suggesting a widespread role 
for the SWI/SNF complex in tumour suppression. However, the 
underlying mechanisms remain poorly understood. Here we show 
that the ARID1A-containing SWI/SNF complex (ARID1A-SWI/SNF)
operates as an inhibitor of the pro-oncogenic transcriptional
coactivators YAP and TAZ. Using a combination of gain- and
loss-of-function approaches in several cellular contexts, we show that
YAP/TAZ are necessary to induce the effects of the inactivation 
of the SWI/SNF complex, such as cell proliferation, acquisition of 
stem cell-like traits and liver tumorigenesis. We found that YAP/ 
TAZ form a complex with SWI/SNF; this interaction is mediated 
by ARID1A and is alternative to the association of YAP/TAZ with 
the DNA-binding platform TEAD. Cellular mechanotransduction 
regulates the association between ARID1A-SWI/SNF and YAP/ 
TAZ. The inhibitory interaction of ARID1A-SWI/SNF and YAP/ 
TAZ is predominant in cells that experience low mechanical 
signalling, in which loss of ARID1A rescues the association between
YAP/TAZ and TEAD. At high mechanical stress, nuclear F-actin
binds to ARID1A-SWI/SNF, thereby preventing the formation
of the ARID1A-SWI/SNF-YAP/TAZ complex, in favour of an
association between TEAD and YAP/TAZ. We propose that a dual 
requirement must be met to fully enable the YAP/TAZ responses: 
promotion of nuclear accumulation of YAP/TAZ, for example, by 
loss of Hippo signalling, and inhibition of ARID1A-SWI/SNF,
which can occur either through genetic inactivation or because of 
increased cell mechanics. This study offers a molecular framework 
in which mechanical signals that emerge at the tissue level together 
with genetic lesions activate YAP/TAZ to induce cell plasticity and 
tumorigenesis. 

Organs must have tissue-level checkpoints to preserve cell fates, 
repair wounds and avoid cancer. The highly related transcriptional 
regulators YAP and TAZ have recently emerged as a fundamental 
sensor through which cells read structural and architectural features of 
their tissue microenvironment using mechanotransduction. Although
YAP is sufficient to trigger several hallmarks of cancer, the normal 
microenvironment of adult tissues inhibits YAP/TAZ, such that emer- 
gence of a solid tumour must include the successful combination of 
YAP/TAZ activation and removal of YAP/TAZ inhibitors. 

We set out to identify the nuclear factors that interact with YAP/TAZ 
using chromatin immunoprecipitation followed by mass spectrometry,
as the regulation of YAP/TAZ in the nuclear compartment has so far
been largely overlooked in comparison to the available knowledge on
YAP/TAZ regulation in the cytoplasm. The association of YAP/TAZ
with several components of the SWI/SNF chromatin-remodelling com- 
plex attracted our attention (Extended Data Fig. 1a and Supplementary 
Table 1). The SWI/SNF complex contains a core ATPase involved in
nucleosome remodelling, either BRG1 or BRM, and other co-factors,
such as ARID1A, the function of which is less well understood. YAP/TAZ
associated with ARID1A, but not with ARID1B (Supplementary
Table 1), which are known to define alternative SWI/SNF complexes.
In several co-immunoprecipitation experimental set-ups, we found
YAP in complex with ARID1A, BRG1, BRM and other components of
the SWI/SNF complex (Fig. 1a-c and Extended Data Fig. 1b, c), also
in the absence of chromatin (Extended Data Fig. 1f).

YAP associates with SWI/SNF through ARID1A. Indeed, depletion
of ARID1A, but not of ARID1B, impaired the ability of YAP to be
incorporated into BRG1- or BRM-containing SWI/SNF complexes (Fig. 1b
and Extended Data Fig. 1c, g). Conversely, depletion of BRM (also
known as SMARCA2) and BRG1 (also known as SMARCA4) did not
affect the association between endogenous ARID1A and YAP proteins
(Fig. 1c). Purified recombinant YAP (or TAZ) and ARID1A proteins
directly interact in vitro (Extended Data Fig. 1h) through physical
association of their WW domain and PPxY motifs, respectively
(Extended Data Fig. 1i-k).

We next assessed the functional relevance of SWI/SNF for YAP-dependent
transcription. SWI/SNF inactivation caused induction of
the YAP-TEAD luciferase reporter (8x GTIIC) through activation
of endogenous YAP/TAZ (Fig. 1d) and also strongly enhanced the
activity of co-transfected exogenous wild-type YAP (Extended Data
Fig. 2a). By contrast, the WW-mutant YAP was insensitive to SWI/SNF
depletion (Extended Data Fig. 2b). SWI/SNF inactivation by
depletion of ARID1A (but not ARID1B) also induced the expression of
several direct target genes of YAP/TAZ, in a manner that is rescued by
concomitant YAP/TAZ depletion (Fig. 1e and Extended Data Fig. 2c, e).
Of note, SWI/SNF depletion affected neither the subcellular localization
of YAP/TAZ nor the phosphorylation level and stability of YAP
(Extended Data Fig. 2f, g); therefore, SWI/SNF acted downstream of
the classic modality of YAP/TAZ regulation that is dictated by Hippo
kinases. Taken together, these findings indicate that SWI/SNF directly
binds and inhibits nuclear YAP/TAZ, thus representing a new layer of
YAP/TAZ regulation.

Our findings raised the possibility that YAP/TAZ regulation may
contribute to SWI/SNF tumour-suppressive functions. There is indeed
a remarkable overlap between the biological effects of YAP/TAZ
activation and of SWI/SNF inactivation, including control of cell fate
plasticity, gain of stemness properties and tumorigenesis. It is known
that loss of SWI/SNF triggers the epithelial to mesenchymal transition
and induces the gain of stem/progenitor-like properties in immortalized
human mammary epithelial cells (HMECs). Notably, we found
that activation of endogenous YAP/TAZ mediates the consequences of
BRG1 or ARID1A inactivation in these cells (Fig. 2a, b and Extended Data
Fig. 3a-g). Therefore, SWI/SNF is a critical barrier that prevents the
activation of endogenous YAP/TAZ; loss of control of this pathway in HMECs
promotes YAP/TAZ-driven induction of stem/progenitor-like properties.


1Department of Molecular Medicine, University of Padua, Padua, Italy. 2Department of Industrial Engineering and INSTM, University of Padua, Padua, Italy. 3German Cancer Research Center
(DKFZ) and Heidelberg University, Heidelberg, Germany. 4Department of Medicine (DIMED), Surgical Pathology and Cytopathology Unit, Padua, Italy. 5IFOM, The FIRC Institute for Molecular
Oncology, Padua, Italy. These authors contributed equally: Lei Chang, Luca Azzolin, Daniele Di Biagio. These authors jointly supervised this work: Michelangelo Cordenonsi, Stefano Piccolo.


*e-mail: michelangelo.cordenonsi@unipd.it; piccolo@bio.unipd.it 
Fig. 1 | YAP interacts with SWI/SNF through ARID1A. a-c, Top,
co-immunoprecipitation experiments. Bottom, schematics of the
corresponding experimental results. See also Extended Data Fig. 1d, e.
a, Endogenous ARID1A (top band) binds to endogenous YAP and
BRG1 in co-immunoprecipitation experiments in MCF10AT cells. IP,
immunoprecipitation. b, Binding of YAP to BRM requires ARID1A in
HEK293T cells. c, Binding of YAP to ARID1A does not require BRG1 and
BRM. d, Luciferase assay using the 8x GTIIC-Lux reporter in HEK293
cells transfected as indicated. e, CTGF expression in MCF10A cells
transfected with the indicated siRNAs. These inductions occurred without
triggering epithelial to mesenchymal transition (Extended Data Fig. 2d).
d, e, Data are mean ± s.d. of n = 3 biologically independent samples;
P values were determined by unpaired two-sided Student's t-test.
Representative experiments are shown, which were repeated
independently two (a-c) or three (d, e) times, all with similar results.


In Drosophila neuroblasts, the SWI/SNF complex acts as a tumour
suppressor, because it prevents cellular dedifferentiation back to a
neural stem-cell (NSC) state. However, we found that the sole depletion
of Brm (also known as Smarca2) or Arid1a is insufficient to trigger
a change of fate in cultures of fetal mouse hippocampal neurons
(Extended Data Fig. 4a-c), possibly because mammalian neurons do
not express the specific factors required for their dedifferentiation. We
previously reported that fetal mouse hippocampal neurons are devoid
of endogenous YAP expression and that expression of exogenous YAP
is sufficient to convert these cells into NSC-like cells. Notably,
inactivation of Brm or Arid1a by short hairpin RNA (shRNA), or genetic
deletion of Arid1a, strongly potentiated the reprogramming of
YAP-expressing neurons into NSCs (Fig. 2c and Extended Data Fig. 4b-e).
Thus, YAP/TAZ are central for executing key biological responses
downstream of SWI/SNF inactivation.

Next, we validated the role of ARID1A-SWI/SNF as a nuclear
inhibitor of YAP/TAZ in vivo. Overactivation of YAP in the liver (for
example, downstream of inactivation of the Hippo pathway by knocking
out Nf2) leads to YAP-driven tumorigenesis but only after a long
period of latency, suggesting that additional genetic or epigenetic
events must be in place to induce the tumorigenic potential of YAP. We
hypothesized that removal of the SWI/SNF complex might be one of
these events. We used mice bearing tamoxifen-inducible Cre recombinase
under the control of the hepatocyte-specific albumin promoter
(Alb8™) to induce genetic ablation of Nf2 and Arid1a in adult hepatocytes
(liver knockout (LKO)) (Extended Data Fig. 5a, b). YAP nuclear
staining was clearly induced around the portal areas of Nf2 LKO mice
(Extended Data Fig. 5c). In spite of this, only a modest induction of
transcriptional activity of YAP/TAZ and moderate phenotypic effects
were observed (that is, ductular reactions with few proliferating cells)
but no tumours developed up to four months after Cre activation
(Fig. 2d, e and Extended Data Fig. 5d, e). Instead, at the same time
point, all mice with combined knockout of Nf2 and Arid1a exhibited
liver overgrowth (Fig. 2d), with widespread areas of neoplasia, including
full-blown cholangiocarcinomas and hepatocellular carcinomas
(Fig. 2e). An extensive degree of proliferation was evident in tumours
and across the remaining hepatocytes (Extended Data Fig. 5d), also
including the hundreds-fold induction of the fetal/tumour marker
Fig. 2 | Loss of SWI/SNF promotes YAP/TAZ-driven biological effects.
a, Depletion of BRG1 in HMECs causes changes in the expression of
YAP/TAZ target genes (CTGF and PTX3) and the indicated markers for
mesenchymal transition (ZEB1 and CDH2 (which encodes N-cadherin))
and epithelial differentiation (CDH1 (which encodes E-cadherin) and
TP63 (which encodes ΔNp63)), in a TAZ-dependent manner. Data are
mean ± s.d. of n = 3 biologically independent samples. b, Mammosphere
formation assay (which measures stem/progenitor-like properties) of
HMECs transduced as indicated. Data are mean ± s.d. of n = 6 biologically
independent samples. c, Neurospheres emerging from neurons infected
with inducible YAP-encoding vectors and the indicated shRNA-encoding
lentiviral vectors. As a negative control, we used a transcriptionally inactive
version of YAP (YAP(S94A)). Scale bar, 300 μm. See also Extended Data
Fig. 4b, c. d, Gross liver images and liver-to-body weight ratio from control
mice (n = 6 mice), and Nf2 (n = 6 mice), Arid1a (n = 7 mice) and Nf2/Arid1a
(n = 7 mice) liver knockout (LKO) mutant mice, four months after
tamoxifen treatment. Scale bars, 1 cm. Data are mean ± s.d. All animals
were included. e, Liver sections from mice described in d were stained
with haematoxylin and eosin. HCC, hepatocellular carcinomas; iCCA,
intrahepatic cholangiocarcinomas. Scale bar, 100 μm. f, Haematoxylin and
eosin-stained liver sections from control (n = 10), Arid1a LKO (n = 12), and
Arid1a/Yap/Taz LKO (n = 15) mice treated with tamoxifen and then fed a
DDC diet, compared to control mice fed a normal diet (n = 10). Scale bar,
100 μm. P values were determined by unpaired two-sided Student's t-test (a, b)
and one-way analysis of variance (ANOVA) with Tukey’s multiple
comparisons test (d). a-c, e, f, Representative experiments are shown, which
were repeated independently three (a, b) or four (c) times and of all mice (e, f),
all with similar results.
Fig. 3 | Mechanical regulation of the association of YAP/TAZ with the
SWI/SNF complex or TEAD. a, Visualization of Flag-tagged NLS-β-actin
filaments in nuclei of HEK293T cells using anti-Flag immunofluorescence.
Scale bar, 10 μm. No nuclear actin filaments were detected in cells
transfected with a non-polymerizable variant of actin (see Extended
Data Fig. 6a). b, Representative pictures of PLA detecting the interaction
between endogenous BRM and Flag-tagged NLS-β-actin in the nucleus
of HEK293T cells, which experienced high mechanical inputs (that is,
spread cells; 94.7% PLA-positive) or were confined to a small adhesive area
(in a dense culture; 0% PLA-positive). See specificity controls using
BRM and BRG1 siRNA (Extended Data Fig. 6b). c, Biotinylated phalloidin
pull-down experiments using HEK293T cells, comparing phalloidin
(Phall) and latrunculin A (Lat-A) treatment. See Methods and Extended
Data Fig. 6d. Gelsolin serves as specificity control for purification of
F-actin. d, In latrunculin-A-treated MCF10AT cells, endogenous YAP
binds to SWI/SNF in an ARID1A-dependent manner. See also Extended
Data Fig. 6f. ARID1A was loaded on a separate gel. e, Representative
PLA images detecting the interaction between endogenous BRM and
a version of YAP forced to enter the nucleus (NLS-YAP) in MCF10A
cells. Cells were allowed to stretch over rigid ECM (high mechanics; 0%
PLA-positive) or, for low mechanical experiments, allowed to adhere to a
small area (100 μm²; 11.25% PLA-positive) or treated with latrunculin A
(18.75% PLA-positive). See also Extended Data Fig. 6g. f, Representative
PLA images detecting the interaction between endogenous TEAD and
NLS-YAP in MCF10A cells, stretched over a rigid ECM (see Methods)
or experiencing low cell mechanics by adhesion to a small adhesive area,
soft ECM or treatment with latrunculin A or anti-integrin-β1 antibodies.
The YAP-TEAD association (high mechanics; 46.3% PLA-positive) is
lost under low mechanical conditions (low mechanics; 0% PLA-positive),
but is rescued after depletion of ARID1A (PLA-positive: small, 44%; soft,
49.8%; latrunculin A, 48.5%; anti-integrin, 67%). See also Extended Data
Fig. 6j. All panels show representative experiments that were repeated
independently three times.


Afp. Notably, YAP/TAZ transcriptional activity was strongly induced 
in livers from Nf2/Arid1a LKO mice compared to livers from Nf2 LKO 
mice (Extended Data Fig. 5e). The liver knockout of Arid1a alone was 
inconsequential; mice remained healthy with an ostensibly normal liver 
for the entire duration of our experiments (Fig. 2d, e and Extended Data 
Fig. 5d, e). This indicates that increasing YAP/TAZ nuclear levels after 
Hippo pathway inactivation is insufficient for their full activation due 
to their nuclear inhibition by ARID1A-SWI/SNF.

Chronic tissue damage is a fundamental driver of liver carcinogenesis,
causing continuous rounds of injury and compensatory proliferation.
To assess the relevance of the ARID1A-YAP/TAZ association in this
context, we fed Arid1a LKO mice a diet supplemented with the toxic
compound 3,5-diethoxycarbonyl-1,4-dihydrocollidine (DDC) for six
weeks. DDC induced the appearance of ductular reactions around
the portal areas in wild-type livers (Fig. 2f). Notably, in all Arid1a
LKO mice that were fed with a DDC diet, we found areas of
cholangiocarcinomatous transformation, with clear signs of atypia, massive
proliferation of tumour cells and increased Afp expression (Fig. 2f
and Extended Data Fig. 5f-h). By contrast, these lesions were absent
from livers of DDC-treated Arid1a/Yap/Taz triple-mutant mice, which
were similar to control DDC-treated mice (Fig. 2f and Extended Data
Fig. 5f, g). Thus, inhibition of YAP/TAZ is an essential mediator of the
tumour-suppressive function of SWI/SNF in vivo.

Data presented above indicate that the interaction between YAP/TAZ
and ARID1A-SWI/SNF represents a modality to inhibit YAP/TAZ
activity inside the nucleus. We next investigated how this interaction
is regulated. Here we focused on modulation by mechanical inputs
through the F-actin cytoskeleton. YAP/TAZ respond to changes in cell
shape and physical forces that are transmitted by the tissue to ultimately
influence the organization of the F-actin cytoskeleton, including
nuclear F-actin organization. We verified that the organization of
nuclear F-actin changed markedly in cells that experienced low compared
to high levels of mechanical signalling. By expressing β-actin
fused to a nuclear localization sequence (NLS-β-actin) in HEK293T
cells, we found that stretched cells (high mechanics) displayed a
network of nuclear actin filaments that crossed the whole nucleoplasm
(Fig. 3a), whereas a finer, almost exclusively perilaminar distribution
was observed in cells confined to small adhesive areas (low mechanics)
(Fig. 3a).

The SWI/SNF complex has previously been reported to associate
with purified F-actin in vitro, raising the possibility that it could be
modulated by mechanical signals in cells. Notably, using in situ
proximity-ligation assays (PLA, a method that enables the investigation
of protein-protein interactions while preserving the structural
integrity of the cells), we found an association between nuclear F-actin
and endogenous SWI/SNF in nuclei of stretched cells, whereas at low
mechanics (that is, on small adhesive areas) cells were almost devoid
of signal (Fig. 3b). A form of β-actin that is unable to polymerize
(NLS-β-actin(R62D)) did not interact with SWI/SNF in stretched
cells (Extended Data Fig. 6c), supporting the view that only polymerized
actin can interact with SWI/SNF. In line with this hypothesis,
purification of endogenous F-actin using biotinylated phalloidin on
streptavidin beads leads to robust co-purification of the ARID1A-SWI/SNF
complex (Fig. 3c), but not in cells treated with the F-actin inhibitor
latrunculin A (Fig. 3c). Notably, by sequential salt extraction of nuclei
(see Methods), ARID1A-SWI/SNF co-purified in the same fractions
as F-actin when we used extracts in which F-actin was preserved with
phalloidin (Extended Data Fig. 6e). By contrast, in the presence of
latrunculin A, a substantial amount of ARID1A-SWI/SNF relocalized
to fractions that did not contain actin (Extended Data Fig. 6e), similar
to the PLA protein-protein interactions in cells grown in conditions
of high or low mechanical stress (Fig. 3b).

We then investigated whether nuclear F-actin might interfere with
the interactions between ARID1A-SWI/SNF and YAP. As shown in
Fig. 3d, no ARID1A-SWI/SNF-YAP associations could be detected in
human mammary epithelial (MCF10A) cell extracts prepared under
conditions preserving F-actin. By contrast, SWI/SNF was co-purified
with endogenous YAP specifically in the absence of F-actin;
this interaction was abolished by concomitant depletion of ARID1A
(Fig. 3d), as expected if ARID1A is required for YAP/TAZ incorporation
into this pool of SWI/SNF (see Fig. 1). These results were recapitulated
in the nuclei of intact MCF10A cells by PLA: YAP interacts
with endogenous BRM only in mechanically inhibited cells (Fig. 3e and
Extended Data Fig. 6g).

Mechanistically, we found that the inhibitory association of YAP/ 
TAZ with ARID1A-SWI/SNF is in fact alternative to the binding of 
YAP/TAZ to their DNA-binding platform, TEAD, which is necessary 
for YAP/TAZ-driven transcription21. We first showed this by 
co-immunoprecipitation: ARID1A associates 
with YAP, but not with TEAD proteins (Extended Data Fig. 6h). Then, 
we used PLA to monitor the dynamics of the YAP-TEAD1 interaction 
in nuclei of cells cultured under conditions of high compared to low 
mechanical signalling. As shown in Fig. 3f and Extended Data Fig. 6i, the 
YAP-TEAD association in MCF10A cell nuclei was severely inhibited 
at low mechanical regimes, when YAP/TAZ are associated with SWI/SNF; 
these conditions included inhibition of cellular mechanotransduction 
extracellularly (by culturing cells on small areas or soft extracellular 
matrix (ECM)), at the level of the integrin (with anti-integrin-β1 
blocking antibodies) or intracellularly (by using the Rho-inhibitor C3 
or inhibiting F-actin with latrunculin A). Notably, under the same 
conditions, depletion of ARID1A was sufficient to restore the YAP-TEAD 
association (Fig. 3f). Taken together, the results suggest the following 
model: in mechanically impaired cells, YAP/TAZ are sequestered within 
the ARID1A-containing pool of SWI/SNF complexes, away from TEAD. 
Conversely, in mechanically challenged cells, nuclear F-actin structures 
engage with ARID1A-SWI/SNF and induce YAP/TAZ detachment 
from that pool of SWI/SNF and their binding to TEAD. 

Fig. 4 | Loss of SWI/SNF rescues YAP/TAZ activities and biological 
effects in mechanically impaired cells. a, b, Under low mechanical 
conditions, ARID1A loss rescues expression of CTGF in MCF10A cells (a) 
and Ankrd1 in Arid1afl/fl fibroblasts (b; Cre and GFP indicate cells 
transduced with Cre- or GFP-encoding adenoviral vectors, respectively) 
in a YAP/TAZ-dependent manner. ARID1A depletion had no effect on 
YAP/TAZ localization (Extended Data Fig. 7a). Data are mean ± s.d. of 
n = 3 biologically independent samples. c, Cell proliferation (measured by 
EdU (5-ethynyl-2'-deoxyuridine) incorporation) in confluent MCF10A 

Following the above model, we next tested whether ARID1A-SWI/ 
SNF inactivation could rescue YAP/TAZ activity in mechanically 
inhibited cells. In epithelial cells (MCF10A, HaCaT) that experienced 
low mechanical stimulation, or in which mechanotransduction was 
inhibited (either by attenuating ECM mechanics or inhibiting intracel- 
lular mechanotransduction), the expression of YAP/TAZ target genes 
was strongly downregulated, as expected’; however, in all conditions, 
YAP/TAZ activity could be restored after ARID1A depletion using short 
interfering RNA (siRNA). Moreover, such rescue of gene expression 
in mechanically impaired cells was YAP/TAZ-dependent (Fig. 4a and 
Extended Data Fig. 7b, c). Similar results were obtained in Arid1afl/fl 
mouse fibroblasts, in which the deletion of Arid1a was achieved by 
infection with an adenoviral vector encoding Cre (Fig. 4b and Extended 
Data Fig. 7d). Therefore, our data indicate that ARID1A functionally 
contributes to the mechanical inhibition of YAP/TAZ, and that YAP/ 
TAZ are key mediators of the effect of ARID1A inactivation. 

If raising cell mechanics attenuates the ARID1A-SWI/SNF-YAP/ 
TAZ inhibitory axis through the F-actin cytoskeleton, then experi- 
mentally raising F-actin should be sufficient to overcome such inhi- 
bition. We tested this hypothesis through depletion of two F-actin 
severing proteins, ADF and Cofilin1, which act as cytoskeletal checkpoints 
of YAP/TAZ activation22; loss of ADF/Cofilin1 potently raised 
YAP/TAZ activity, to levels that could not be further modulated 
by ARID1A inactivation (Extended Data Fig. 7e). This is consistent 
with the results obtained in fibroblasts cultured on a stiff ECM (Fig. 4b, 
high mechanics) and with the notion that, at maximal mechanical 
signalling, the ARID1A-SWI/SNF-YAP/TAZ inhibitory axis is already 
disabled, making ARID1A depletion under these conditions essentially 
inconsequential. 

We next determined whether the loss of SWI/SNF induced 
reactivation of YAP/TAZ-driven biological responses in otherwise 
mechanically inhibited cells. YAP/TAZ inactivation by lowered cell 
mechanics is a main inducer of contact inhibition of proliferation 
(CIP) in a post-confluent epithelial sheet22. Indeed, contact inhibition 
of proliferation can be revoked (leading to S phase re-entry) through 
substantial Hippo-independent mechanical activation of YAP/TAZ, 
either by stretching the cell monolayer or remodelling the F-actin 
cytoskeleton22,23. Phenocopying the effects of raised cell mechanics, 
SWI/SNF inactivation also triggered S phase re-entry in post-confluent 
epithelial sheets in a YAP/TAZ-dependent manner (Fig. 4c and 
Extended Data Fig. 8a). 

Finally, we determined the role of cell mechanics in regulating YAP- 
driven changes in cell fate. For this, we used YAP-induced reprogramming 
of neurons into NSC-like cells12, and hypothesized that such 
reprogramming should be disabled if cells are placed on a soft ECM, 
where the inhibitory function of ARID1A-SWI/SNF on YAP/TAZ 
is predominant. Indeed, only few neurospheres emerged from YAP- 
expressing wild-type neurons plated on a soft ECM compared to stiff 
ECM (Extended Data Fig. 8b); more relevantly, shRNA-mediated 
depletion of Arid1a or Brm rescued the ability of YAP to reprogram 
neurons into NSC-like cells on soft ECM (Fig. 4d and Extended Data 
Fig. 8c-e). 

Our results shed light on the mechanisms of SWI/SNF tumour sup- 
pression, an aspect of cancer biology that has remained unclear. We 
found that SWI/SNF binds to and inhibits YAP/TAZ; this inhibitory 
function is restricted to the ARID1A-containing fraction of SWI/SNF 
complexes. YAP/TAZ are essential and sufficient for the unfolding of 
complex cellular phenotypes inherent to inactivation of ARID1A-SWI/SNF, 
which occurs at exceedingly high frequency in human malignancies1. 
Note that, in this scenario, pools of SWI/SNF that do not consist of 
ARID1A-SWI/SNF, such as those containing ARID1B (here shown 
to be irrelevant for YAP/TAZ regulation), remain in place to carry 
out other SWI/SNF functions, such as chromatin remodelling. In 
line with this, in living tissues and explanted cells, ARID1A appears 
to be a largely dispensable protein, for which a tumour-suppressive 
role becomes apparent under genetic or environmental conditions that 
lead to nuclear accumulation of YAP/TAZ. Of note, others before us 
have noted the interaction between SWI/SNF and TAZ, but concluded 
that SWI/SNF positively cooperated with TAZ-induced transcription 
of some targets24. We have been unable to confirm the generality of 
those conclusions in our analyses, which included multiple redundant 
reagents, cellular systems and in vivo genetic models and which all 
instead point to the opposite conclusion. 

A second element of general interest concerns the mechanisms of 
YAP/TAZ mechanotransduction. Our results suggest the existence 
of a pathway that could be streamlined as follows: cell mechanics 
promotes the accumulation of nuclear F-actin, which binds the 
ARID1A-SWI/SNF pool, thus relieving YAP/TAZ from SWI/SNF 
inhibition. Biochemically, our data suggest that mechanical 
signals tune the ARID1A-SWI/SNF-YAP/TAZ inhibitory axis by 
controlling the levels and structural organization of the nuclear pool of 
F-actin. This nuclear pathway complements Hippo-independent and 
Hippo-regulated YAP/TAZ mechanotransduction that occurs in the 
cytoplasm?””. 

Our data argue in favour of a paradigm in which, to fully unleash 
YAP/TAZ activity, at least two requirements need to be met: promotion 
of YAP/TAZ nuclear accumulation and SWI/SNF inhibition. 
Inactivation of the Hippo pathway alone is insufficient to fully enable 
YAP/TAZ activity in the absence of concomitant inactivation of ARID1A. 
This indicates that the response to a number of signals that promote 
YAP/TAZ nuclear localization (including loss of Hippo signalling) 
would also concomitantly require a proficient mechanical environment 
to surpass the ARID1A-SWI/SNF barrier (Fig. 4e). It also suggests 
that nuclear levels of YAP/TAZ that are too low or transient to elicit 
any effect in normal cells may become above-threshold after genetic 
or mechanical inhibition of ARID1A-SWI/SNF. More generally, the 
data show how a genetic lesion, such as loss of ARID1A in tumour cells, 
may serve as a means to increase cellular responsiveness to an epigenetic 
signal, such as mechanotransduction. 
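
The two-requirement paradigm described above can be restated as a small truth table. The sketch below is only an illustrative Boolean encoding of the qualitative statements in the text; the class name, field names and function names are hypothetical, and no quantitative or kinetic behaviour is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellState:
    """Qualitative description of a cell (hypothetical names, illustration only)."""
    yap_taz_nuclear: bool   # YAP/TAZ have accumulated in the nucleus
    high_mechanics: bool    # e.g. stiff ECM or spread cell, promoting nuclear F-actin
    arid1a_present: bool    # False models genetic loss or depletion of ARID1A


def swi_snf_barrier_active(state: CellState) -> bool:
    """The ARID1A-SWI/SNF barrier holds only if ARID1A is present
    and is not engaged by nuclear F-actin (that is, at low mechanics)."""
    return state.arid1a_present and not state.high_mechanics


def yap_taz_fully_active(state: CellState) -> bool:
    """Both requirements from the text: nuclear YAP/TAZ and no SWI/SNF barrier."""
    return state.yap_taz_nuclear and not swi_snf_barrier_active(state)
```

In this encoding, a cell with nuclear YAP/TAZ at low mechanics is inactive while ARID1A is present and becomes active once ARID1A is lost, which mirrors the rescue experiments summarized in Fig. 4.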


Online content 

Any methods, additional references, Nature Research reporting summaries, source 
data, statements of data availability and associated accession codes are available at 
https://doi.org/10.1038/s41586-018-0658-1. 


Received: 12 December 2017; Accepted: 7 September 2018; 
Published online 31 October 2018. 


1. Kadoch, C. & Crabtree, G. R. Mammalian SWI/SNF chromatin remodeling 
complexes and cancer: mechanistic insights gained from human genomics. 
Sci. Adv. 1, e1500447 (2015). 

2. Totaro, A., Panciera, T. & Piccolo, S. YAP/TAZ upstream signals and downstream 
responses. Nat. Cell Biol. 20, 888-899 (2018). 

3. Panciera, T., Azzolin, L., Cordenonsi, M. & Piccolo, S. Mechanobiology of YAP and 
TAZ in physiology and disease. Nat. Rev. Mol. Cell Biol. 18, 758-770 (2017). 

4. Rafiee, M. R., Girardot, C., Sigismondo, G. & Krijgsveld, J. Expanding the circuitry 
of pluripotency by selective isolation of chromatin-associated proteins. Mol. Cell 
64, 624-635 (2016). 

5. Ege, N. et al. Quantitative analysis reveals that actin and Src-family kinases 
regulate nuclear YAP1 and its export. Cell Syst. 6, 692-708 (2018). 

6. Chen, H. I. & Sudol, M. The WW domain of Yes-associated protein binds a 
proline-rich ligand that differs from the consensus established for Src 
homology 3-binding modules. Proc. Natl Acad. Sci. USA 92, 7819-7823 (1995). 

7. Dupont, S. et al. Role of YAP/TAZ in mechanotransduction. Nature 474, 
179-183 (2011). 

8. Wang, H. et al. BRCA1/FANCD2/BRG1-driven DNA repair stabilizes the 
differentiation state of human mammary epithelial cells. Mol. Cell 63, 
277-292 (2016). 

9. Cordenonsi, M. et al. The Hippo transducer TAZ confers cancer stem cell-related 
traits on breast cancer cells. Cell 147, 759-772 (2011). 

10. Zanconato, F., Cordenonsi, M. & Piccolo, S. YAP/TAZ at the roots of cancer. 
Cancer Cell 29, 783-803 (2016). 

11. Eroglu, E. et al. SWI/SNF complex prevents lineage reversion and induces 
temporal patterning in neural stem cells. Cell 156, 1259-1273 (2014). 

12. Panciera, T. et al. Induction of expandable tissue-specific stem/progenitor cells 
through transient expression of YAP/TAZ. Cell Stem Cell 19, 725-737 (2016). 

13. Zhang, N. et al. The Merlin/NF2 tumor suppressor functions through the YAP 
oncoprotein to regulate tissue homeostasis in mammals. Dev. Cell 19, 27-38 
(2010). 

14. Benhamouche, S. et al. Nf2/Merlin controls progenitor homeostasis and 
tumorigenesis in the liver. Genes Dev. 24, 1718-1730 (2010). 

15. Bakiri, L. & Wagner, E. F. Mouse models for liver cancer. Mol. Oncol. 7, 206-223 
(2013). 

16. Plessner, M., Melak, M., Chinchilla, P., Baarlink, C. & Grosse, R. Nuclear F-actin 
formation and reorganization upon cell spreading. J. Biol. Chem. 290, 
11209-11216 (2015). 

17. Grosse, R. & Vartiainen, M. K. To be or not to be assembled: progressing into 
nuclear actin filaments. Nat. Rev. Mol. Cell Biol. 14, 693-697 (2013). 

18. Baarlink, C., Wang, H. & Grosse, R. Nuclear actin network assembly by formins 
regulates the SRF coactivator MAL. Science 340, 864-867 (2013). 

19. Rando, O. J., Zhao, K., Janmey, P. & Crabtree, G. R. Phosphatidylinositol- 
dependent actin filament binding by the SWI/SNF-like BAF chromatin 
remodeling complex. Proc. Natl Acad. Sci. USA 99, 2824-2829 (2002). 

20. Miralles, F., Posern, G., Zaromytidou, A. I. & Treisman, R. Actin dynamics control 
SRF activity by regulation of its coactivator MAL. Cell 113, 329-342 (2003). 

21. Zanconato, F. et al. Genome-wide association between YAP/TAZ/TEAD and AP-1 
at enhancers drives oncogenic growth. Nat. Cell Biol. 17, 1218-1227 (2015). 

22. Aragona, M. et al. A mechanical checkpoint controls multicellular growth through 
YAP/TAZ regulation by actin-processing factors. Cell 154, 1047-1059 (2013). 

23. Benham-Pyle, B. W., Pruitt, B. L. & Nelson, W. J. Mechanical strain induces 
E-cadherin-dependent Yap1 and β-catenin activation to drive cell cycle entry. 
Science 348, 1024-1027 (2015). 

24. Skibinski, A. et al. The Hippo transducer TAZ interacts with the SWI/SNF 
complex to regulate breast epithelial lineage commitment. Cell Rep. 6, 
1059-1072 (2014). 


Acknowledgements We thank A. Fujimura for help with neuron preparation; 
G. Della Giustina for micropattern fabrication; V. Guzzardo for histology; 
C. Frasson and G. Basso for FACS; D. M. Livingston for HMECs and plasmids; 
D. J. Pan, M. Giovannini, Z. Wang, P. Chambon, and I. De Curtis and 
R. Brambilla for gifts of mice; R. Treisman for ACTB (encoding β-actin) cDNAs; 
L. Naldini for plasmids; S. Dupont for performing the initial experiments leading 
to biochemical identification of SWI/SNF and for the protocol to perform 
F-actin pull-down; Gianluca Grenci and Mona Suryana (MBI-Singapore) 
and the MBI microfabrication facility team for the supply of quartz masks. This 
work is supported by AIRC Special Program Molecular Clinical Oncology ‘5 per 
mille’, by an AIRC PI-Grant, by a MIUR-FARE grant, and by Epigenetics Flagship 
project CNR-MIUR grants to S.P. This project has received funding from the 
European Research Council (ERC) under the European Union’s Horizon 2020 
research and innovation programme (DENOVOSTEM grant agreement No 
670126 to S.P.). 


Reviewer information Nature thanks M. Sudol, P. Wade and the other 
anonymous reviewer(s) for their contribution to the peer review of this work. 


Author contributions L.C. carried out experiments in vitro, and L.A. carried out 
experiments on mice. Roles of other coauthors: D.D.B., molecular biology and 
IFs; D.D.B. and R.L.X., liver experiments; G.Ba. and F.Z., molecular biology and 
preparation of samples for ChIP-MS; L.C. and T.P., neuronal reprogramming; 
S.G., hydrogel preparation; G.S. and J.K., mass spectroscopy; M.F., histology 
and histopathological evaluations; G.Br. and A.G., microfabrication. S.P. and 
M.C. conceived the initial hypothesis and experimental design, and planned, 
discussed and organized the work. L.C., L.A., F.Z., M.C. and S.P. wrote the 
manuscript. 


Competing interests The authors declare no competing interests. 


Additional information 

Extended data is available for this paper at https://doi.org/10.1038/s41586-018-0658-1. 

Supplementary information is available for this paper at https://doi.org/10.1038/s41586-018-0658-1. 

Reprints and permissions information is available at http://www.nature.com/reprints. 

Correspondence and requests for materials should be addressed to M.C. or S.P. 
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional 
claims in published maps and institutional affiliations. 


8 NOVEMBER 2018 | VOL 563 | NATURE | 269 


© 2018 Springer Nature Limited. All rights reserved. 


LETTER 


METHODS 


Reagents and plasmids. Latrunculin A, phalloidin, cerivastatin and tamoxifen 
were obtained from Sigma-Aldrich. Doxycycline was obtained from Calbiochem. 
Growth-factor-reduced Matrigel was obtained from Corning. C3 was obtained 
from Cytoskeleton Inc. Dasatinib was obtained from Selleckchem. Fasudil was 
obtained from Tocris. Anti-integrin-β1 antibody (P5D2) was obtained from DSHB, 
University of Iowa. Cre- and GFP-expressing adenoviruses were obtained from the 
Gene Transfer Vector Core, University of Iowa. 

HA-human YAP(5SA) (all five serines, which are phosphorylated by 
LATS1 and LATS2, were mutated to alanines in YAP), Flag-human YAP(WT), 
Flag-YAP(S94A) and Flag-human YAP(5SA) were cloned in pcDNA3.1 for 
transient expression, or in pBABE retroviral plasmids to establish stable 
cell lines. The pBABE-Puro empty vector was used as control for retroviral 
transduction. siRNA-insensitive Flag-YAP(WW1mut), Flag-YAP(WW2mut) 
and Flag-YAP(WW1/2mut) were generated by PCR from the corresponding 
original cDNAs (a gift from M. Sudol, Addgene plasmids 19046, 19047 and 
19048, respectively’) and subcloned in pcDNA3.1 for transient expression. 
pCS2-Flag-mouse TAZ(WT) or pCS2-Flag-mouse TAZ(ΔWW) (deletion 
of residues 110-159) were as in a previous publication”®. pCS2-Flag-BRM 
was obtained by subcloning Flag-BRM from pBABEpuro-Flag-human BRM 
(a gift from R. Kingston, Addgene plasmid 1961’) into pCS2. pCS2-Flag-BRG1 
was obtained by subcloning Flag-BRG1 from pBABEpuro-Flag-human 
BRG1 (a gift from R. Kingston, Addgene plasmid 19597) into pCS2. 
pcDNA6-V5-ARID1A(WT) was a gift from I.-M. Shih (Addgene plasmid 393117). 
pcDNA6-V5-ARID1A(PPxA) (containing the PPAY148/PPGY915 to PPAA148/ 
PPGA915 mutations) was generated as follows: the N-terminal cDNA fragment 
of ARID1A containing the Y148A/Y915A mutations was synthesized by 
GeneScript and swapped into pcDNA6-V5-ARID1A(WT) by using the NheI/ 
HpaI restriction sites. 

For doxycycline-inducible expression of YAP in MCF10A cells, cDNA of NLS- 
YAP was subcloned in pCW-MCS, obtained by substituting the sequence between 
NheI and BamHI of pCW57.1 (a gift from D. Root, Addgene 41393) with a new 
multiple cloning site (MCS). 

For inducible expression of YAP in mouse neurons, FUW-tetO-YAP(WT) and 
FUW-tetO-YAP(S94A) (deposited as Addgene plasmids 84009 and #84010!, 
respectively) were used in combination with FUdeltaGW-rtTA (a gift from 
K. Hochedlinger, Addgene 19780”). Empty vector (FUW-tetO-MCS, Addgene 
84008) was used as negative control. 

The constructs for control shRNA, Arid1a shRNA and Brm shRNA expression 
in primary neurons were prepared by cloning the control shRNA (shCo), mouse 
Arid1a shRNA (shArid1a#1 and shArid1a#2), mouse Brm shRNA (shBrm#1 and 
shBrm#2) sequences (see ‘RNA interference’) into the pLKO.1-puro lentiviral vector 
(a gift from B. Weinberg, Addgene 8453°°) according to the manufacturer’s 
protocol. 

For stable shRNA infection of HMECs, we used pLKO.1-puro lentiviral vectors 
expressing control shRNA (see ‘RNA interference’), BRG1 shRNA8 and ARID1A 
shRNA (from Sigma-Aldrich) in combination with psUPER-RETRO-BLASTI 
vectors containing the GFP or TAZ RNA-interference sequences (as previously 
described’). 

Plasmids encoding Flag-NLS-β-actin(WT) and Flag-NLS-β-actin(R62D) were 
generated by PCR from original cDNAs provided by R. Treisman” and cloned in 
pcDNA3.1. 

For glutathione S-transferase (GST) pull-down experiments, full-length mouse 
TAZ and human YAP1 were cloned in pGEX4T1. 

All constructs were confirmed by sequencing. 

Micropatterns. The following procedure was used to make the adhesive micropatterns: 
a layer of photoresist (MICROPOSIT S1805 G2 Positive Photoresist, Dow) 
was spin-coated (3,000 r.p.m. for 30 s) on a glass substrate, functionalized with 
trimethoxysilylpropyl methacrylate, and cured at 120 °C for 1 min. The positive 
resist was patterned by ultraviolet-light (UV) exposure for 8 s in air by irradiation 
with a collimated UV lamp at 365 nm (UV365, Reinraumtechnik Lanz) through a 
quartz chromium mask with the desired pattern (arrays of 10 µm × 10 µm squares). 
The exposed areas, those around the squares, were removed by immersing the 
substrate in the developer solution MF 319 for 8 s. To polymerize non-adhesive 
polyacrylamide brushes outside of the squares, a drop of acrylamide solution (8% w/v 
in water with 0.225 w/v of ammonium persulfate and 1.5% v/v 
tetramethylethylenediamine) was put between the patterned glass and a blank coverslip and left 
to react for 30 min in air. The sandwich structure was detached by immersing it 
for 30 min in water; the functionalized pattern was then put in water overnight to 
completely remove unpolymerized acrylamide. The unexposed resist (the square 
areas) was stripped in acetone for 30 s and rinsed in water. Finally, after sterilization 
under UV light, the square areas were functionalized with fibronectin by putting 
a drop of protein solution (10 µg ml−1 in water) on top, leaving it to react for 1 h 
and then rinsing it in PBS. 


Cell lines and treatments. HMECs were a gift from D. Livingston8 (DFCI) and 
were cultured in MEGM medium (Lonza). MCF10A and MCF10AT (also called 
MII) cells were a gift from F. Miller (Karmanos) and were cultured in DMEM/F12 
(Gibco) with 5% horse serum, glutamine and antibiotics, freshly supplemented 
with insulin (Sigma-Aldrich), EGF (Peprotech), hydrocortisone (Sigma-Aldrich) 
and cholera toxin (Sigma-Aldrich). HEK293 or HEK293T cells were from ATCC 
and were cultured in DMEM (Gibco) supplemented with 10% fetal bovine serum 
(FBS), glutamine and antibiotics. HaCaT cells were a gift from N. Fusening (DKFZ) 
and were cultured in DMEM (Gibco) supplemented with 10% FBS, glutamine and 
antibiotics. HEK293, HEK293T, MCF10A, MCF10AT and HaCaT were authenticated 
by DSMZ/Eurofins Genomics. All cell lines tested negative for mycoplasma 
contamination. 

For experiments with NLS-YAP-transduced MCF10A cells, cells were treated 
with 0.5 µg ml−1 doxycycline in culture medium for the whole duration of the 
experiments. 

For stiff versus soft ECM experiments, cells plated on standard fibronectin- 
coated tissue culture supports or on fibronectin-coated >40 kPa hydrogels 
(produced as described previously’) were considered as cultured on a ‘stiff ECM’ 
under the high mechanical conditions, as indicated in the figures. For experiments 
on soft ECM, 5,000-10,000 cells per cm2 were seeded in a drop on top 
of 0.7-kPa fibronectin-coated hydrogels; after attachment, the wells containing 
the hydrogels were filled with the appropriate medium. Cells were collected for 
immunofluorescence or RNA extraction after 24 h. For experiments with cells 
experiencing small cell-ECM adhesion in ultra-confluent monolayers”’, we 
plated 200,000 cells per cm2 in the appropriate well (that is, plated at approximately 
150% confluency). Cells were collected for immunofluorescence or RNA 
extraction after 48 h. For experiments with fibronectin-coated micropatterns, 
cells were seeded on fibronectin-coated micropatterns (100 µm2 ‘small area’ in 
Fig. 3); after attachment, floating cells were removed and wells were filled with 
medium; cells were fixed 24 h later. These cells were compared to cells plated on 
an unpatterned/unconfined adhesive area (defined as stretched cells and labelled 
as ‘high mechanics’). 
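
The plating densities quoted above translate into cell numbers once a growth area is chosen. The following is a minimal sketch of that arithmetic; the function names are hypothetical and the growth area used in the usage note is an illustrative assumption, since the text does not state which culture vessels were used.

```python
def cells_to_seed(density_per_cm2: float, growth_area_cm2: float) -> int:
    """Cells needed to reach a plating density (cells per cm2) on a given area (cm2)."""
    if density_per_cm2 < 0 or growth_area_cm2 <= 0:
        raise ValueError("density must be >= 0 and area must be > 0")
    return round(density_per_cm2 * growth_area_cm2)


def square_area_um2(side_um: float) -> float:
    """Adhesive area of a square micropattern with the given side length (µm)."""
    return side_um * side_um


# Densities taken from the text (cells per cm2).
SOFT_ECM_DENSITY_RANGE = (5_000, 10_000)
ULTRA_CONFLUENT_DENSITY = 200_000

# Side length of the micropatterned squares described in 'Micropatterns' (µm).
MICROPATTERN_SIDE_UM = 10.0
```

For example, assuming a hypothetical 2 cm2 growth area, `cells_to_seed(ULTRA_CONFLUENT_DENSITY, 2.0)` gives the number of cells for the ultra-confluent condition, and `square_area_um2(MICROPATTERN_SIDE_UM)` reproduces the 100 µm2 'small area' quoted for the micropatterns.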

Latrunculin A was used at a final concentration of 0.5 µM for the time indicated 
in the description of F-actin pull-down, sequential salt extraction and in 
situ proximity ligation assay experiments. C3 was used at a final concentration 
of 0.5 µg ml−1 in culture medium for 24 h. Dasatinib was used at a final 
concentration of 0.1 µM for 24 h. Fasudil was used at a final concentration of 
10 µM for 24 h. Anti-integrin-β1 was used at a final concentration of 
0.23 µg ml−1 for 24 h. Cerivastatin was used at a final concentration of 
5 µM for 24 h. 

Primary fibroblasts (from biopsies of adult mouse ears) were cultured in DMEM 
(Gibco) supplemented with 20% FBS, glutamine and antibiotics. For the experi- 
ments depicted in Fig. 4b and Extended Data Fig. 7d, fibroblasts were transduced 
with adenoviral vectors and transfected with the indicated siRNAs (day 0), replated 
either on a soft or a stiff ECM (day 1), and then collected for RNA extraction 48 h 
later (day 3). 

RNA interference. siRNA transfections were done with Lipofectamine RNAi- 
MAX (Thermo Fisher Scientific) in antibiotics-free medium according to the 
manufacturer's instructions. Sequences of siRNAs are provided in Supplementary 
Table 3. 

Western blot. Cells were collected in lysis buffer (50 mM HEPES (pH 7.5), 100 mM 
NaCl, 50 mM KCl, 1% Triton X-100, 5% glycerol, 0.5% NP-40, 2 mM MgCl2, 1 µM 
DTT, and phosphatase and protease inhibitors) and lysed at 4°C by sonication. 
Extracts were quantified using the Bradford method. Proteins were run on 4-12% 
NuPAGE-MOPS acrylamide gels (ThermoFisher) and transferred onto PVDF 
membranes by wet electrophoretic transfer. Blots were blocked with 0.5% non-fat 
dry milk and incubated overnight at 4°C with primary antibodies. Secondary anti- 
bodies were incubated for 1 h at room temperature, and then blots were developed 
with chemiluminescent reagents. Images were acquired with Image Quant LAS 
4000 1.2 (GE healthcare). 

For western blot: anti-YAP/TAZ (sc-101199), anti-BAF53A (sc-137062 or 
sc-47808), anti-BRG1 (sc-10768 or sc-17796), anti-lamin B (sc-6216), anti- 
SMARCC1 (also known as BAF155) (sc-137138 or sc-9746), anti-SNF5 (sc- 
166165), anti-vimentin (sc-7557-r), anti-gelsolin (sc-57509) and anti-TEAD4 
(sc-101184) were from Santa Cruz; anti-ARID1A (HPA005456), anti-SNF5 
(HPA018248), anti-TAZ (HPA007415) and anti-β-actin (A5316) were from Sigma- 
Aldrich; anti-YAP (ab52771), anti-histone H3 (ab1791) and anti-BRM (ab15597) 
were from Abcam; anti-GAPDH (MAB347) and anti-ARID1A (04-080) monoclo- 
nal antibodies were from Millipore. Anti-E-cadherin (610181) and anti-TEAD1 
(610922) were from BD. Anti-phosphorylated YAP (S127) (CST 4911) was from 
Cell Signaling Technology. 

Horseradish-peroxidase-conjugated anti-Flag (clone M2, A8592) was from 
Sigma-Aldrich, anti-HA (A190-107P) was from Bethyl and the anti-V5 antibody 
was from Abcam (ab27671). 

Unless otherwise specified, loading controls for all blots were run on the same gel. 

F-actin pull-down experiments. For the experiments depicted in Fig. 3c, cells 
were plated in sparse conditions and treated for 4 h with latrunculin A (0.5 µM) 
or biotinylated phalloidin (40 ng ml−1). After treatment, cells were washed with 
prewarmed HBSS once and collected in ‘actin lysis buffer’ (20 mM HEPES (pH 7.5), 
50 mM KCl, 0.1% Triton X-100, 5% glycerol, 0.1% NP-40, 5 mM MgCl2, 1 µM 
DTT, 10 µM MG115, 10 µM MG132, 1 mM ATP, 20 µM phosphocreatine di(Tris) 
salt (P1937, Sigma-Aldrich), and phosphatase and protease inhibitors). All buffers 
were freshly prepared and prewarmed at room temperature. Cells were scraped 
and mechanically lysed by passaging ten times through a 26G-needle syringe at 
room temperature. For latrunculin-A-treated cells, latrunculin A was also present 
in the buffers (1 µM) used for collecting cells and pull-down experiments to avoid 
any F-actin re-assembly; for the biotinylated-phalloidin-treated cells, biotinylated 
phalloidin was also present in the buffers used for collecting cells and pull-down 
experiments (40 ng ml−1). Extracts were cleared by centrifugation (10,000g in 
low-retention Eppendorf tubes) at room temperature and supernatants were 
immediately (we never froze supernatants) incubated at room temperature for 
3 h with streptavidin-conjugated resin (Sigma-Aldrich) and biotinylated phalloidin 
(1 µg ml−1). Phalloidin-bound complexes were then washed with actin lysis buffer 
three times at room temperature, resuspended in SDS sample buffer, incubated at 
95°C for 3 min, and subjected to SDS-PAGE and western blot analysis. 

Sequential salt extraction. We have adapted a sequential salt-extraction assay for 
evaluating the chromatin-binding affinities of the SWI/SNF complex in HEK293T 
cells. All buffers were freshly prepared and prewarmed at room temperature before 
use and all procedures were carried out at room temperature. Nuclei were isolated 
from confluent HEK293T cells grown on 10-cm dishes by hypotonic lysis in 5 ml 
buffer 1 (20 mM HEPES (pH 7.5), 10 mM KCl, 0.1% NP-40, 5% glycerol, 5 mM 
MgCl2, 1 µM DTT, 10 µM MG115, 10 µM MG132, 1 mM ATP, 20 µM phosphocreatine 
di(Tris) salt, and phosphatase and protease inhibitors) for 5 min. After 
centrifugation at 600g for 3 min, the supernatant was saved for western blot analysis, 
whereas the nuclear pellet was sequentially resuspended and centrifuged at 6,000g 
for 3 min in buffer 1 supplemented with increasing concentrations of NaCl (from 
0 to 600 mM), as indicated in Extended Data Fig. 6e. The released proteins in each 
fraction were directly analysed by SDS-PAGE and western blot. 

For latrunculin-A-treated cells (0.5 µM, 4 h treatment), latrunculin A (1 µM) 
was also present in all the buffers used for collecting cells and the salt-extraction 
assay, to avoid any F-actin re-assembly; for phalloidin-treated cells, phalloidin 
(50 µM) was also present in all the buffers used for collecting cells and the 
salt-extraction assay. 
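
Preparing buffer 1 at each NaCl step is a standard C1·V1 = C2·V2 dilution. The sketch below shows that calculation only; the 5 M NaCl stock, the 1 ml fraction volume and the intermediate steps in the usage note are assumptions for illustration, because the text specifies only the 0 to 600 mM range and refers to Extended Data Fig. 6e for the individual steps.

```python
def stock_volume_ul(target_mm: float, final_volume_ul: float,
                    stock_mm: float = 5000.0) -> float:
    """Volume of NaCl stock (µl) needed to reach target_mm in final_volume_ul.

    Uses C1 * V1 = C2 * V2. The default stock of 5000 mM (5 M) is an
    assumption, not a value taken from the paper.
    """
    if final_volume_ul <= 0 or stock_mm <= 0:
        raise ValueError("volumes and stock concentration must be positive")
    if not 0 <= target_mm <= stock_mm:
        raise ValueError("target must lie between 0 and the stock concentration")
    return target_mm * final_volume_ul / stock_mm


def dilution_plan(targets_mm, final_volume_ul, stock_mm=5000.0):
    """Return (target, stock volume, buffer volume) for each NaCl step, in µl."""
    plan = []
    for target in targets_mm:
        v_stock = stock_volume_ul(target, final_volume_ul, stock_mm)
        plan.append((target, v_stock, final_volume_ul - v_stock))
    return plan
```

A call such as `dilution_plan([0, 150, 300, 600], 1000.0)` uses placeholder steps; the actual series should be read from Extended Data Fig. 6e. Note that topping up with NaCl stock slightly dilutes the other buffer components unless they are adjusted separately.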

Co-immunoprecipitation of endogenous proteins. For immunoprecipitation 
experiments of endogenous proteins shown in Fig. 1a, c and Extended Data Fig. 6h, 
cells were plated (day 0), transfected with the indicated siRNAs (day 1), collected 
two days after siRNA transfection (day 3) and lysed by sonication in lysis buffer 
(50 mM HEPES (pH 7.5), 100 mM NaCl, 50 mM KCl, 1% Triton X-100, 5% glycerol, 
0.5% NP-40, 2 mM MgCl2, 1 µM DTT, and phosphatase and protease inhibitors). 
Extracts were cleared by centrifugation and incubated with anti-ARID1A antibody 
(sc-98441, Santa Cruz) or control anti-HA antibody (sc-805, Santa Cruz), immo- 
bilized on protein A-sepharose beads at 4°C for 3 h. Immunocomplexes were 
then washed with cold lysis buffer three times, resuspended in SDS sample buffer, 
incubated at 95°C for 3 min and subjected to SDS-PAGE and western blot analysis. 

For the experiments depicted in Fig. 3d, cells were plated in sparse conditions, 
and treated and collected as described above for F-actin pull-down experiments. 
Extracts were cleared by centrifugation at room temperature and incubated with 
anti-YAP antibody (ab52771, Abcam) immobilized on protein A-Sepharose beads 
for 3 h at room temperature. Immunocomplexes were then washed with actin lysis 
buffer (see ‘F-actin pull-down experiments’) three times at room temperature, 
resuspended in SDS sample buffer, incubated at 95°C for 3 min, and subjected to 
SDS-PAGE and western blot analysis. 

Co-immunoprecipitation of tagged proteins. Cells were collected and lysed 
by sonication in lysis buffer (50 mM HEPES (pH 7.5), 100 mM NaCl, 50 mM 
KCl, 1% Triton X-100, 5% glycerol, 0.5% NP-40, 2 mM MgCl2, 1 µM DTT, and 
phosphatase and protease inhibitors) and extracts were cleared by centrifuga- 
tion at 4°C. Extracts were incubated for 3 h at 4°C with anti-Flag resin (Sigma- 
Aldrich). Immunocomplexes were then washed with cold lysis buffer three times, 
resuspended in SDS sample buffer, incubated at 95°C for 3 min, and subjected 
to SDS-PAGE and western blot analysis. Inputs were loaded based on Bradford 
assay measurements. In particular, for Fig. 1b and Extended Data Fig. 1c, g, i-k, we 
used lysates from HEK293T cells transfected with the indicated plasmids 
(concentrations of plasmids were as follows). For Fig. 1b: Flag-BRM, 83 ng cm⁻²; 
HA-YAP(5SA), 17 ng cm⁻². For Extended Data Fig. 1c: Flag-BRG1, 83 ng cm⁻²; 
HA-YAP(5SA), 17 ng cm⁻². For Extended Data Fig. 1g: HA-YAP(5SA), 
17 ng cm⁻²; Flag-BRG1, 83 ng cm⁻². For Extended Data Fig. 1i: Flag-YAP(WT), 
83 ng cm⁻²; Flag-YAP(WW1mut), 83 ng cm⁻²; Flag-YAP(WW2mut), 83 ng cm⁻²; 
Flag-YAP(WW1/2mut), 83 ng cm⁻². For Extended Data Fig. 1j: Flag-TAZ(WT), 
83 ng cm⁻²; Flag-TAZ(ΔWW), 83 ng cm⁻². For Extended Data Fig. 1k: Flag-YAP, 
83 ng cm⁻²; V5-ARID1A(WT), 166 ng cm⁻²; V5-ARID1A(PPxA), 166 ng cm⁻². 
Lysates were collected 48 h after transfection. Where indicated, siRNAs were trans- 
fected 24 h before DNA transfection. For Extended Data Fig. 1b, we used lysates 
from empty vector-transduced MCF10A cells or MCF10A cells constitutively 
expressing Flag-YAP(5SA). 
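Because the plasmid amounts above are given per unit of growth area, the mass needed for a particular vessel follows by multiplying by that area. The sketch below illustrates this arithmetic for the Fig. 1b mix; the function name and the 10 cm² growth area are hypothetical choices made for illustration, as the Methods do not state the vessel size.

```python
def plasmid_mass_ng(density_ng_per_cm2: float, growth_area_cm2: float) -> float:
    """Return the plasmid mass (ng) needed for a vessel of the given growth area."""
    if density_ng_per_cm2 < 0 or growth_area_cm2 <= 0:
        raise ValueError("density must be >= 0 and area must be > 0")
    return density_ng_per_cm2 * growth_area_cm2


# Densities quoted in the Methods for Fig. 1b (ng per cm^2).
FIG_1B_MIX = {"Flag-BRM": 83.0, "HA-YAP(5SA)": 17.0}

# Hypothetical growth area; the Methods do not state the vessel size.
EXAMPLE_AREA_CM2 = 10.0

# Mass of each plasmid for the example vessel, and the total DNA per vessel.
masses = {
    name: plasmid_mass_ng(density, EXAMPLE_AREA_CM2)
    for name, density in FIG_1B_MIX.items()
}
total_ng = sum(masses.values())
print(masses, total_ng)
```

Expressing amounts per cm² keeps the DNA-to-cell ratio comparable when the same transfection is scaled between vessel formats.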

GST pull-down experiments. For the experiment in Extended Data Fig. 1h, 
V5-ARID1A was purified from transfected HEK293T cells. In brief, cells were 
transfected with pcDNA6-V5-ARID1A, collected and lysed by sonication in lysis 
buffer (50 mM HEPES (pH 7.5), 100 mM NaCl, 50 mM KCl, 1% Triton X-100, 
5% glycerol, 0.5% NP-40, 2 mM MgCl₂, 1 µM DTT, and phosphatase and protease 
inhibitors) and extracts were cleared by centrifugation at 4°C. Extracts were incu- 
bated for 3 h at 4°C with anti-V5 resin (Sigma-Aldrich). After washing three times 
with lysis buffer (2 min for each wash at room temperature), V5-ARID1A protein 
was eluted by incubation with V5 peptide (V7754, Sigma-Aldrich) in lysis buffer. 
V5 resin was eliminated by centrifugation. For the GST pull-down experiments, 
beads with purified proteins (GST-YAP or GST-TAZ, as indicated) were incubated 
with purified V5-ARID1A in lysis buffer for 3 h at 4°C. After three washes, GST 
pull-down proteins were analysed by western blot. 

For the experiment in Extended Data Fig. 1f, beads with purified GST-YAP were 
incubated for 3 h at 4°C with the 0-mM NaCl fraction, which contained proteins 
released from DNase-treated nuclei of HEK293T cells. To prepare such extracts, 
nuclei of HEK293T cells were isolated from confluent HEK293T cells grown on 
10-cm dishes by hypotonic lysis in 5 ml buffer 1 for 5 min. After centrifugation at 
600g for 3 min, the supernatant was saved for western blot analysis, whereas the 
nuclear pellet was subjected to DNase treatment for 30 min at 37°C in buffer 1 
supplemented with 1 mM CaCl₂. After centrifugation at 6,000g for 3 min, the 
supernatant was discarded and the DNase-treated nuclear pellet was sequentially 
resuspended and centrifuged at 6,000g for 3 min in buffer 1 supplemented with 
increasing concentrations of NaCl (from 0 to 600 mM). The 0-mM NaCl fraction 
was used for GST pull-down experiments. After three washes in buffer 1, GST 
pull-down proteins were then analysed by western blot. 

Identification of native YAP/TAZ complexes by mass spectrometry. Live cells 
were cross-linked with 1% formaldehyde (Sigma-Aldrich) in culture medium for 
10 min at room temperature before collection. Lysis was achieved by consecutive 
incubations in lysis buffer 1 (50 mM HEPES, pH 7.5, 10 mM NaCl, 1 mM EDTA, 
10% glycerol, 0.5% NP-40 and 0.25% Triton X-100), lysis buffer 2 (10 mM 
Tris-HCl pH 8, 200 mM NaCl, 1 mM EDTA and 0.5 mM EGTA) and lysis buffer 3 
(10 mM Tris-HCl pH 8, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1% sodium- 
deoxycholate and 0.5% N-lauroylsarcosine), followed by sonication with a Branson 
Sonifier 450D. Immunoprecipitation was performed by incubating cleared extracts 
(corresponding to 2 x 10° cells) with 20 µg of antibody (anti-YAP: EP1674Y, 
Abcam; anti-TAZ: HPA007415, Sigma-Aldrich; pre-immune rabbit IgG: 15006, 
Sigma-Aldrich) and 100 µl of Dynabeads-protein G (Invitrogen). After extensive 
washing, immunoprecipitates were eluted in 7.5% SDS, 200 mM DTT and 
de-crosslinked. After alkylation with iodoacetamide, proteins were purified with 
SP3 beads as previously described*! resuspended in 50 mM ammonium bicarbo- 
nate and digested with trypsin. Peptides were subjected to SP3 cleanup and they 
were eluted in 0.1% trifluoroacetic acid. Samples were analysed on an Orbitrap 
Fusion mass spectrometer (Thermo Fisher). 

Quantitative real-time PCR (qPCR). Cells were collected using the RNeasy 
Mini Kit (Qiagen) for total RNA extraction, and contaminant DNA was removed 
by DNase treatment. Total RNA from fibroblasts (Fig. 4b and Extended Data 
Fig. 7d) and from livers (Extended Data Fig. 5e, g) was extracted using TRIzol 
(ThermoFisher) and NucleoSpin RNA (MACHEREY-NAGEL, 740955.250), 
respectively. qPCR analyses were carried out on reverse-transcribed cDNAs 
with QuantStudio 5 (Applied Biosystems, Thermo Fisher Scientific) and analysed 
with QuantStudio Design & Analysis software (version 1.4.3). Expression levels 
are always normalized to GAPDH. PCR oligonucleotide sequences are listed in 
Supplementary Table 2. 
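As an illustration of what normalizing expression levels to GAPDH can look like in practice, the sketch below uses the common comparative Ct (2^−ΔΔCt) calculation. This is an assumption: the Methods state only that values were normalized to GAPDH and analysed with the QuantStudio software, and do not name the exact formula. The function name and the Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both amplicons.

```python
def relative_expression(
    ct_target_sample: float,
    ct_gapdh_sample: float,
    ct_target_control: float,
    ct_gapdh_control: float,
) -> float:
    """Fold change of a target gene versus control by the 2^-ddCt method.

    Assumes about 100% amplification efficiency for target and reference
    (an assumption of this sketch, not something stated in the Methods).
    """
    # dCt: target Ct relative to the GAPDH reference within each condition.
    delta_ct_sample = ct_target_sample - ct_gapdh_sample
    delta_ct_control = ct_target_control - ct_gapdh_control
    # ddCt: sample dCt relative to control dCt.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier in the sample
# than in the control, while GAPDH is unchanged.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)
```

A two-cycle shift relative to the reference corresponds to a fourfold change under these assumptions, which is why small Ct differences translate into large expression differences.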

Proliferation assay (EdU staining). Cells were first transfected with indicated 
siRNAs under standard culture conditions. The day after, cells were replated in 
fibronectin-coated glass chamber slides. After 24 h, EdU (10 µM) was added to 
the culture medium for 1 h. Cells were then fixed in 4% paraformaldehyde (PFA) 
for 10 min at room temperature. The EdU assays were performed according to the 
manufacturer's instructions (Click-iT EdU Imaging Kits, Invitrogen). Images were 
obtained with a Leica TCS SP5 equipped with a CCD camera and analysed using 
Volocity software (PerkinElmer, version 5.5.1). 

Immunofluorescence. Immunofluorescence on PFA-fixed cells and on PFA-fixed 
paraffin-embedded tissue slices was performed as previously described”. 

Primary antibodies were: anti-YAP/TAZ (sc-101199, Santa Cruz), anti-cytokeratin 
(wide spectrum screening, Z0622; Dako), anti-E-cadherin (610181, 
BD) and anti-Flag (F1804, Sigma-Aldrich). Secondary antibodies (1:200) were 
from Molecular Probes. Samples were counterstained with ProLong-DAPI 
(Molecular Probes, Life Technologies) to label cell nuclei. Confocal images were 
obtained with a Leica TCS SP5 equipped with a CCD camera and analysed using 
Volocity software (PerkinElmer, version 5.5.1). 

Immunohistochemical staining experiments were performed on PFA-fixed, 
paraffin-embedded tissue sections as previously described’. For immunohisto- 
chemistry: anti-Ki-67 polyclonal antibody (clone SP6; M3062) was from Spring 
Bioscience; anti-YAP (13584-1-AP) was from Proteintech. 

In situ proximity ligation assay (PLA). In situ PLAs were performed with Duolink 
in situ reagents (Sigma-Aldrich). 

For the experiments in Fig. 3b and Extended Data Fig. 6b, c, HEK293T cells 
were plated in standard cell culture dishes on day 1. Cells were transfected with 
the indicated DNA plasmids on day 2 (concentrations of plasmids were as follows: 
Flag-NLS-β-actin(WT), 150 ng cm⁻²; Flag-NLS-β-actin(R62D), 150 ng cm⁻²; 
empty vector, 150 ng cm⁻²). On day 3, cells were transfected with the indicated 
siRNAs. On day 4, cells were replated into fibronectin-coated glass chamber slides. 
Cells from each condition were plated in duplicate. After 24 h, cells were fixed in 
4% PFA for 10 min at room temperature. With one of the duplicates, we performed 
anti-Flag (F-1804, Sigma-Aldrich) immunofluorescence in order to check the DNA 
transfection efficiency: the percentage of transfected cells was used for normal- 
ization. The other duplicate was subjected to PLA, following the manufacturer's 
instructions. Primary antibodies used in the PLA are: anti-Flag (F-1804, Sigma- 
Aldrich) and anti-BRM (ab15597, Abcam). 
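The normalization step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the Methods say that the percentage of transfected cells was used for normalization but do not give the formula, so dividing the PLA-positive percentage by the transfected percentage is an assumed reading. The function names and cell counts are hypothetical; the default minimum of 90 counted cells per condition follows the figure quoted in these Methods for the Fig. 3 percentages.

```python
def percent_positive(positive_cells: int, counted_cells: int, min_cells: int = 90) -> float:
    """Percentage of PLA-positive cells among manually counted cells."""
    if counted_cells < min_cells:
        raise ValueError(f"count at least {min_cells} cells per condition")
    if not 0 <= positive_cells <= counted_cells:
        raise ValueError("positive_cells must lie between 0 and counted_cells")
    return 100.0 * positive_cells / counted_cells


def normalize_to_transfected(pla_percent: float, transfected_percent: float) -> float:
    """Express the PLA-positive percentage relative to the transfected fraction.

    Assumed formula (not stated in the Methods): PLA% / transfected% x 100.
    """
    if not 0 < transfected_percent <= 100:
        raise ValueError("transfected_percent must be in (0, 100]")
    return 100.0 * pla_percent / transfected_percent


# Hypothetical counts: 45 PLA-positive cells out of 90 counted, with 80% of
# cells Flag-positive in the duplicate well stained by immunofluorescence.
raw_percent = percent_positive(45, 90)
normalized_percent = normalize_to_transfected(raw_percent, 80.0)
print(raw_percent, normalized_percent)
```

Normalizing in this way avoids scoring untransfected cells as negative for an interaction that they could not have shown.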

For the experiments in Fig. 3e, f and Extended Data Fig. 6g, i, j, NLS-YAP 
MCF10A cells were treated with doxycycline to induce NLS-YAP expression for 
the whole duration of the experiments. For Fig. 3e and Extended Data Fig. 6g, 
cells were either plated on small/micropatterned fibronectin-coated areas or 
treated for 24 h with latrunculin A (0.5 µM) or dasatinib (0.1 µM). For Fig. 3f and 
Extended Data Fig. 6i, j, cells were first transfected with siRNAs (control siRNA 
or ARID1A siRNA) and replated on small micropatterned/fibronectin-coated 
areas, soft (0.7 kPa) hydrogels, or treated for 24 h with latrunculin A (0.5 µM), C3 
(0.5 µg ml⁻¹) or anti-integrin-β1 (0.23 µg ml⁻¹). Cells at ‘high mechanics’ were 
plated on unpatterned fibronectin-coated chamber slides. Cells were then fixed 
in 4% PFA for 10 min at room temperature. Samples were subjected to PLA, fol- 
lowing the manufacturer's instructions. Primary antibodies used in the PLA are: 
anti-YAP (sc-101199, Santa Cruz) and anti-BRM (ab15597, Abcam) for Fig. 3e and 
Extended Data Fig. 6g; anti-YAP (ab52771, Abcam) and anti-TEAD1 (610922, BD 
Biosciences) for Fig. 3f and Extended Data Fig. 6i, j. 

Images were acquired with a Leica TCS SP5 confocal microscope equipped with 
a CCD camera and analysed using Volocity software (PerkinElmer, version 5.5.1). 

The percentages of PLA-positive cells reported in the legend of Fig. 3 were 
determined by manual counting of at least 90 cells for each experimental 
condition. 

Lenti- and retrovirus preparation. Lentiviral particles were prepared by transiently 
transfecting HEK293T cells (as previously described) with lentiviral vectors 
(10 µg per 60-cm² dish) together with packaging vectors pMD2-VSVG (2.5 µg) and 
pPAX2 (7.5 µg) using TransIT-LT1 (Mirus Bio) according to the manufacturer's 
instructions. 

Retroviral particles were prepared by transiently transfecting HEK293GP 
(Takara) with retroviral vectors (15 µg per 60-cm² dish) together with pMD2-Env 
(5 µg per 60-cm² dish) using TransIT-LT1. Infections were carried out as previously 
described’. 

Mammosphere assays. Confluent monolayers of HMECs were trypsinized, 
counted and plated as single-cell suspensions (with a density of 1,000 cells per 
cm²) on ultra-low attachment plates (Costar). Cells were cultured in DMEM/F12 
supplemented with 1× B27 (Invitrogen), glutamine, antibiotics, 5 µg ml⁻¹ insulin 
(Sigma-Aldrich), 20 ng ml⁻¹ EGF (Peprotech), 0.5 µg ml⁻¹ hydrocortisone 
(Sigma-Aldrich), 52 µg ml⁻¹ BPE (Thermo Fisher), 20 ng ml⁻¹ bFGF (Peprotech) and 
4 µg ml⁻¹ heparin. Mammospheres were counted after 10-14 days. 

Luciferase assays. Luciferase assays were performed in HEK293 cells with the 
established YAP/TAZ-responsive luciferase reporter 8x GTIIC-Lux’. 

8x GTIIC-Lux reporter (50 ng cm⁻²) was transfected together with CMV-β-gal 
(75 ng cm⁻²) to normalize for transfection efficiency using a CPRG (Roche) 
colorimetric assay. DNA transfections were done with TransIT-LT1 (Mirus Bio) 
according to the manufacturer’s instructions. DNA content in all samples was kept 
uniform by adding a pBluescript plasmid at concentrations up to 250 ng cm⁻². 
For experiments using siRNA-depleted cells (Fig. 1d and Extended Data Fig. 2a, 
b), cells were plated at 15% confluence (day 0), transfected with the indicated 
siRNAs (day 1), changed to culture medium and transfected with plasmid DNA 
(concentrations of plasmids: for Extended Data Fig. 2a: empty vector was 
2 ng cm⁻², Flag-YAP(WT) was 2 ng cm⁻²; for Extended Data Fig. 2b: 
Flag-YAP(WT) was 2 ng cm⁻², Flag-YAP(WW1mut) was 21 ng cm⁻²) (day 2), and 
collected 48 h later (day 4). 
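The two normalization steps used for these reporter assays (correcting each well for transfection efficiency with the co-transfected β-gal readout, then expressing the result relative to the control condition) can be sketched as below. The function names and all readings are hypothetical, and the exact arithmetic is an assumption, since the Methods describe the normalization only in words.

```python
from statistics import mean


def normalize_to_bgal(luciferase: list[float], bgal: list[float]) -> list[float]:
    """Divide each luciferase reading by the beta-gal reading of the same well."""
    if len(luciferase) != len(bgal):
        raise ValueError("luciferase and beta-gal readings must be paired per well")
    return [luc / gal for luc, gal in zip(luciferase, bgal)]


def fold_over_control(sample: list[float], control: list[float]) -> list[float]:
    """Express normalized sample values relative to the mean of the control wells."""
    control_mean = mean(control)
    return [value / control_mean for value in sample]


# Hypothetical raw readings for n = 3 wells per condition (arbitrary units).
control_norm = normalize_to_bgal([100.0, 200.0, 50.0], [1.0, 2.0, 0.5])
sample_norm = normalize_to_bgal([300.0, 600.0, 150.0], [1.0, 2.0, 0.5])
fold = fold_over_control(sample_norm, control_norm)
print(fold)
```

In this example the raw luciferase readings differ fourfold between wells, yet the β-gal correction shows that the wells agree, which is the purpose of the co-transfected control.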

Primary neuron isolation, infection and culturing. Preparation of neurons, 
transduction and culturing were performed as previously described. In brief, 
neurons were isolated from hippocampi of embryonic day (E)18-E19 embryos of 
the indicated genotypes and plated on poly-L-lysine-coated wells (stiff conditions) 
or on top of a thick 0.5-cm Matrigel layer (soft conditions) in DMEM supple- 
mented with 10% FBS, glutamine and antibiotics (day 1). After 24 h (day 2), the 
medium of the hippocampal preparation was changed to fresh DMEM:Neurobasal 
(1:1) supplemented with 5% FBS, 1x B27, glutamine and antibiotics. For repro- 
gramming experiments, neurons were infected on the following day (day 3) with 
FUW-tetO-YAP(WT) and FUdeltaGW-rtTA viral supernatants. Negative con- 
trols were provided by neurons transduced with FUdeltaGW-rtTA in combina- 
tion with FUW-tetO-YAP(S94A) or empty vector. After 24 h (day 4), medium 
was changed and cells were incubated in Neurobasal medium supplemented with 
1× B27, glutamine, antibiotics and 5 µM Ara-C (cytosine β-D-arabinofuranoside, 
Sigma-Aldrich) for an additional seven days, at the end of which well-differentiated 
neurons were visible. 

For the experiments in Extended Data Figs. 4e, 8b, neurons were switched to 
NSC medium and 2 µg ml⁻¹ doxycycline for activating tetracycline-inducible gene 
expression. Sphere formation was evident upon YAP induction after 14 days on 
stiff ECM with doxycycline treatment. 

For the experiments in Figs. 2c, 4d and Extended Data Figs. 4b, c, 8c, d, 
after Ara-C treatment, neurons were infected with pLKO.1-shRNA vectors. For 
the infection of a 10-cm² plate, we mixed 500 µl of pLKO.1-shRNA produced 
in NSC medium (DMEM/F12 supplemented with 1× N2, 20 ng ml⁻¹ mouse 
EGF, 20 ng ml⁻¹ mouse bFGF, glutamine and antibiotics) and 1.5 ml of serum-free 
Neurobasal medium with 1× B27. After 24 h of infection, treated neurons 
were switched to NSC medium and 2 µg ml⁻¹ doxycycline to activate 
tetracycline-inducible gene expression. After seven days, fresh doxycycline (final 
concentration of 2 µg ml⁻¹) was added. Sphere formation was evident upon 
YAP induction after 14 days (stiff conditions) or 30-45 days (soft conditions) 
of doxycycline treatment. 

Bright-field images were acquired with a Leica DM IRB microscope using LAS 
version 4.4 software. 

Mice. Transgenic lines used in the experiments were provided by: D. Pan (Nf2fl/fl); 
Z. Wang (Arid1afl/fl; these mice have loxP sites flanking exon 8); P. Chambon 
(Alb-creERT2)33; I. De Curtis and R. Brambilla (Syn1-cre). Tazfl/fl and double 
Yapfl/fl;Tazfl/fl conditional knockout mice were as previously described. 

Animals were genotyped using standard procedures” and using the recom- 
mended set of primers. Animal experiments were performed adhering to our insti- 
tutional and national guidelines as approved by OPBA (University of Padova) and 
the Ministry of Health of Italy. For experiments using mice, the limits for the end 
point ‘body-condition scoring’ were never exceeded. 

For the experiment in Extended Data Fig. 4e, we used control (Arid1afl/+) 
and Syn1-cre;Arid1afl/+ mice. For this, we crossed Syn1-cre females (as transgene 
expression in male mice results in germline recombination) with Arid1afl/fl 
males. Littermate embryos derived from these crossings were collected at 
E18-E19 and kept separate for neuron derivation and subsequent treatments 
(as described in ‘Primary neuron isolation, infection and culturing’). Genotypes 
were confirmed on embryonic tail biopsies and leftover brains. These animals 
were mixed strains. 

Yap, Taz, Arid1a and Nf2 conditional knockout mice were intercrossed with 
Alb-creERT2 mice to obtain the different genotypes used for the experiments in Fig. 2 
and Extended Data Fig. 5 (including controls). These animals were mixed strains. 
For the induction of recombination in the liver, mice of the indicated genotypes 
(two months old) received one intraperitoneal injection per day of 3 mg tamoxifen 
(Sigma-Aldrich) dissolved in corn oil (Sigma-Aldrich) during five consecutive 
days. For the experiments depicted in Fig. 2d, e and Extended Data Fig. 5c-e, mice 
were euthanized four months after tamoxifen treatment. For the DDC experiments 
(Fig. 2f and Extended Data Fig. 5f-h), two weeks after tamoxifen treatment, mice 
were fed with either normal diet (Mucedola) or the same diet containing 0.1% DDC 
(Sigma-Aldrich) for six weeks (DDC diet; Mucedola). 

Statistics. The number of biological and technical replicates and the number of 
animals are indicated in figure legends, main text and Methods. All tested ani- 
mals were included. Animal ages are specified in the text and Methods. Sample 
size was not predetermined. Randomization was not applicable for our experi- 
ments with cell lines. Mice were randomly allocated to experimental or treatment 
groups to ensure equal sex/age across genotypes. Investigators were not blinded 
for analyses relying on unbiased measurements of quantitative parameters, with 
the exception of pathological examination of histological sections carried out by 
M.E (a professional pathologist), who was blind to animal genotypes, sex/age 
or treatment. Data are mean ± s.d. or mean ± s.e.m. as indicated in the legends 
of the figures and extended data figures. Student’s t-test, Mann-Whitney U-test 
and one-way ANOVA analyses were performed with GraphPad Prism 7.0d for 
Mac software. 
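For readers less familiar with the two summary formats used in the legends, the sketch below shows how mean ± s.d. and mean ± s.e.m. relate for a set of replicate values. It uses only the Python standard library and is an illustration, not the authors' analysis, which was carried out in GraphPad Prism. The function name and the replicate values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values: list[float]) -> dict[str, float]:
    """Return n, mean, sample s.d. and s.e.m. for a set of replicate values."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for s.d. and s.e.m.")
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    return {"n": n, "mean": mean(values), "sd": sd, "sem": sd / sqrt(n)}


# Hypothetical values for n = 3 biologically independent samples.
summary = summarize([1.0, 2.0, 3.0])
print(f"mean ± s.d.:   {summary['mean']:.2f} ± {summary['sd']:.2f}")
print(f"mean ± s.e.m.: {summary['mean']:.2f} ± {summary['sem']:.2f}")
```

The s.e.m. is the s.d. divided by the square root of n, so it is always the smaller of the two; this is why each legend states which measure its error bars show.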

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 

Data availability 

Mass spectrometry data can be found in Supplementary Table 1. Source Data for 
Figs. 1, 2, 4 and Extended Data Figs. 2-5, 7, 8 can be found in the online version 
of the paper. Uncropped images of immunoblots can be found in Supplementary 
Fig. 1. All relevant data are included in the manuscript as Source Data or 
Supplementary Information; all other data are available from the corresponding 
authors upon reasonable request. 

Extended Data Fig. 1 | Interaction between YAP and the SWI/SNF 
complex. a, Proteomic analyses of endogenous YAP/TAZ-binding 
partners reveal interactions with endogenous components of the SWI/SNF 
complex (green). Red, the bait used. R1 and R2 are the results from 
n = 2 biologically independent samples. See Supplementary Table 1. 

b, YAP(5SA) was immunoprecipitated from lysates of MCF10A cells 
stably expressing Flag-tagged YAP(5SA) using an anti-Flag antibody, and 
co-precipitating endogenous components of the SWI/SNF complex were 
detected by western blot. As a negative control, immunoprecipitation (IP) 
was repeated with cells transduced with empty vector. GAPDH serves as 
a loading control for inputs (right). c, HEK293T cells were transfected 
with independent siRNAs against the indicated genes (ARID1A in lanes 
5 and 6; BAF53A in lanes 7 and 8; SNF5 in lanes 9 and 10) and control 
siRNAs (siCo.; lanes 1-4) and with plasmids encoding HA-YAP(5SA) 
(all lanes) and Flag-BRG1 (lanes 3-10), as indicated. Cell lysates were 
subjected to anti-Flag immunoprecipitation and co-precipitating proteins 
were checked by western blot. ARID1A depletion impaired the interaction 
between YAP and BRG1, but had no effect on the association of BRG1 
with BAF53A (lanes 5 and 6). Depletion of BAF53A (lanes 7 and 8) or 
SNF5 (lanes 9 and 10) had no effect on the interaction between YAP 
and BRG1. ARID1A blot, top band represents the full-length ARID1A. 
Input ARID1A was from a separate gel. d, Western blots of the inputs of 
the immunoprecipitation experiment shown in Fig. 1b. HEK293T cells 
were transfected with control (Co.) siRNA or siRNA against ARID1A 
and with plasmids encoding HA-YAP(5SA) and Flag-BRM, as indicated. 
e, Western blots of the inputs of the immunoprecipitation experiment 
shown in Fig. 1c. HEK293T cells were transfected with control siRNAs 
or with a siRNA mix against BRG1 and BRM. f, DNase-treated nucleus 
preparations from HEK293T cells were subjected to sequential salt 
extraction and fractions were analysed by western blot (left, lanes 1-4). 
The unsonicated, chromatin-free 0-mM NaCl fraction was incubated 
with GST-YAP or GST protein (negative control), immobilized on a 
glutathione resin, and proteins that were pulled down were analysed by 
western blot (right, lanes 5 and 6). g, HEK293T cells were transfected 
with siRNAs against the indicated genes and with plasmids encoding 
HA-YAP(5SA) and Flag-BRG1, as indicated. Cell lysates were subjected 
to anti-Flag immunoprecipitation and co-precipitating proteins were 
checked by western blot. h, Western blot of recombinant V5-ARID1A 
pulled down by GST-YAP or GST-TAZ, immobilized on a glutathione 
resin. GST protein was used as a negative control. Input, a fraction of 
V5-ARID1A used for the pull-down experiments. i, HEK293T cells were 
transfected with plasmids encoding empty vector (e.v.) or Flag-YAP(WT) 
or WW-domain mutants, as indicated. Cell lysates were subjected to 
anti-Flag immunoprecipitation and western blot analysis of endogenous 
ARID1A. GAPDH serves as a loading control in inputs. j, Flag-TAZ was 
immunoprecipitated from lysates of HEK293T cells transfected with 
Flag-tagged TAZ(WT) or TAZ(ΔWW) using an anti-Flag antibody, and 
co-precipitating endogenous ARID1A was detected by western blot only with 
TAZ(WT). As a negative control, immunoprecipitation was repeated using 
HEK293T cells transfected with empty vector. k, HEK293T cells were 
transfected with plasmids encoding Flag-YAP(WT) (all lanes) and either 
V5-ARID1A(WT) or V5-ARID1A(PPxA) mutant, as indicated. Cell 
lysates were subjected to anti-Flag immunoprecipitation and western blot 
analysis of V5-ARID1A. We note that other SWI/SNF components (such 
as BRG1 or SNF5) also carry PPxY motifs; although these components 
are by themselves not essential for the association with YAP/TAZ, the 
presence of a second WW motif in YAP (although not in TAZ) raises the 
possibility of stronger, cooperative associations between YAP and other 
elements of the SWI/SNF complex. b, c, f-k, Panels display representative 
experiments, repeated independently two (c, f-k) or three (b) times, all 
with similar results. 

Extended Data Fig. 2 | Effect of ARID1A depletion on YAP/TAZ 
levels, localization and activity. a, Results of luciferase assays with 
the 8x GTIIC-Lux reporter in HEK293 cells transfected with empty or 
YAP-expressing vectors and the indicated siRNAs. Data are normalized 
to control siRNA- and empty vector-transfected cells and are presented 
as mean ± s.d. of n = 3 biologically independent samples. b, Results 
of luciferase assays with the 8x GTIIC-Lux reporter in HEK293 cells 
reconstituted with either YAP(WT) or YAP(WW1mut) and transfected 
with the indicated siRNAs. Data are normalized to control 
siRNA-transfected cells and are presented as mean ± s.d. of n = 3 biologically 
independent samples. c, qPCR analyses of the YAP/TAZ targets ANKRD1, 
CYR61 and PTX3 in MCF10A cells transfected as indicated. Data are 
mean ± s.d. of n = 3 biologically independent samples. d, Western blot 
analysis of ARID1A, E-cadherin and vimentin from lysates of MCF10A 
cells transfected with the indicated siRNAs. e, qPCR analyses of CTGF 
(left) and ARID1B (right) expression in MCF10A cells transfected as 
indicated. Data are mean ± s.d. of n = 3 biologically independent samples. 
f, Representative confocal images (left) and quantifications (right; >100 
cells per condition) of YAP/TAZ localization in MCF10A cells transfected 
with the indicated siRNAs. g, Western blot analysis of YAP, TAZ and YAP 
phosphorylated at the key Hippo/LATS target site (p-YAP S127) in lysates 
of MCF10AT cells transfected with the indicated siRNAs. P values were 
determined by unpaired two-sided Student's t-test; n.s., not significant. 
All panels display representative experiments, repeated independently 
two (d, e, g) or three (a-c, f) times with similar results. 

Extended Data Fig. 3 | YAP and TAZ are required for the biological 
effects of SWI/SNF depletion in HMECs. a, b, HMECs were transduced 
with the indicated shRNA-encoding vectors and collected for protein 
extraction (a) or RNA extraction (b). a, Western blot of BRG1, TAZ 
and epithelial (ECAD) and mesenchymal (vimentin) markers. b, qPCR 
analyses of mesenchymal (TWIST1) and epithelial (KRT18) markers. Data 
are mean ± s.d. of n = 3 biologically independent samples. Continuation 
of Fig. 2a. c-e, HMECs were transduced with the indicated 
shRNA-encoding vectors and/or transfected with the indicated siRNAs and 
collected for RNA extraction. qPCR analyses of the indicated genes are 
shown. Data are mean ± s.d. of n = 3 biologically independent samples. 
f, Mammospheres formed by HMECs transduced with the indicated 
shRNAs and transfected with indicated siRNAs. Data are mean ± s.d. of 
n = 6 biologically independent samples. g, HMECs were transduced with 
the indicated shRNA-encoding vectors and analysed for their CD44 and 
CD24 immunophenotype. Quantification of the percentage of cells that 
displayed either a CD44highCD24low (stem-like mesenchymal cells) or 
CD44lowCD24high (differentiated epithelial cells) profile. P values were 
determined by unpaired two-sided Student's t-test. All panels display 
representative experiments, repeated independently three times with 
similar results. 

Extended Data Fig. 4 | SWI/SNF depletion potentiates YAP-induced 
reprogramming of neurons into NSCs. a, Efficiency of Brm and Arid1a 
downregulation in neurons transduced with the indicated 
shRNA-encoding vectors, as measured by qPCR. Data are mean ± s.d. of n = 3 
biologically independent samples. A representative experiment repeated 
twice with similar results is shown. b, c, Related to Fig. 2c. Neurons 
were infected with doxycycline-inducible YAP-encoding vectors or 
empty vector and the indicated shRNA-encoding lentiviral vectors. 
b, Representative images of the cultures after 14 days in NSC medium with 
doxycycline. Scale bar, 300 µm. c, Quantification of the emerging (P0) 
neurospheres. Data are mean ± s.e.m. of four independent experiments; 
*P = 0.03 for comparisons between YAP(WT)-expressing neurons 
transduced with control shRNA (shCo.) and Brm shRNAs (shBrm) or 
between YAP(WT)-expressing neurons transduced with control shRNA 
and Arid1a shRNAs. d, e, Effect of Arid1a depletion on YAP-induced 
reprogramming of neurons. d, Syn1-cre drives Arid1a knockout specifically 
in neurons as shown by genotyping. Genomic DNA from neurons was 
compared to genomic DNA from the tail of the same Syn1-cre;Arid1afl/+ 
mouse. PCR bands are shown for the indicated alleles. e, Control 
(Arid1afl/+) and Arid1a+/− (from Syn1-cre;Arid1afl/+ mice) neurons were 
infected with inducible YAP-encoding vectors. Left, representative images 
of P0 neurospheres that emerged from these cultures after doxycycline 
treatment in NSC medium. Scale bar, 300 µm. Right, quantification of 
P0 neurospheres that emerged from these cultures after doxycycline 
treatment in NSC medium. Data are mean ± s.e.m. of four independent 
experiments. YAP(S94A) serves as negative control. e complements Fig. 2c 
and Extended Data Fig. 4b, c, which show comparable results between 
shRNA and genetic attenuation of Arid1a. P values were determined by 
unpaired two-sided Student's t-test (a) and by two-sided Mann-Whitney 
U-test (c, e). 

Extended Data Fig. 5 | Effect of ARID1A depletion in hepatocytes on 
tumour formation. a, qPCR analysis of Nf2 and Arid1a expression in 

the livers of control (n = 6 mice), and Nf2 (n = 6 mice), Arid1a (n = 7 
mice) and Nf2/Arid1a (n = 7 mice) liver mutant (LKO) mice, four months 
after tamoxifen treatment. All animals were included. Mean and data for 
individual mice are shown. b, Livers of control (Arid1afl/fl) and Arid1a 
LKO (Alb-creERT2;Arid1afl/fl) mice were collected two weeks after tamoxifen 
treatment, and genomic DNA and proteins were extracted using standard 
procedures. Representative results are shown; experiments were repeated 
on four mice for each genotype. Left, PCR analysis of the indicated 
alleles. Right, western blots of GAPDH (loading control) and ARID1A. 
c, YAP immunohistochemistry (IHC) staining in control and Nf2 mutant 
livers. Scale bars, 40 µm. Representative images of experiments that 
were independently replicated using three mice for each genotype, with 
similar results. d, Continuation of Fig. 2e. Representative cytokeratin 
(CK; top) and Ki-67 (bottom) stainings of sections of livers of the 
indicated genotypes (same genotypes as in Fig. 1d, e and Extended Data 
Fig. 5a). Note intrahepatic cholangiocarcinomas (iCCA; CK+Ki-67+) and 
hepatocellular carcinomas (HCC; Ki-67+CK−) were found only in livers 
from Nf2/Arid1a LKO mice. Scale bars, 100 µm. Representative images 
are shown; experiments were independently replicated for all of the mice of 
each genotype described in a, with similar results. e, qPCR analysis of 
selected genes of livers of mice with the indicated genotypes. All animals 
were included. Data are normalized to Nf2/Arid1a LKO mice. Data are 
mean ± s.d. for same number of mice per genotype as in a. f, Continuation 
of Fig. 2f. Control, Arid1a LKO and Arid1a/Yap/Taz LKO mice were 
treated with tamoxifen and were then fed a DDC-containing diet for six 
weeks. CK (top; scale bars, 40 µm) and Ki-67 (bottom; scale bars, 20 µm) 
stainings of liver sections from the indicated mice. Note the presence 
of early cholangiocarcinoma lesions (CK+Ki-67+) in the Arid1a LKO 
mice and their absence upon concomitant YAP/TAZ loss (that is, in the 
Arid1a/Yap/Taz LKO mice). Asterisks indicate porphyrin deposits, which 
are typically present in the liver of mice treated with DDC. Representative 
images are shown; experiments were independently replicated for all of the 
mice of each genotype (same number of mice as in Fig. 2f), with similar 
results. g, Representative qPCR analysis of Afp expression in the livers of 
control (n = 4), Arid1a LKO (n = 5), Arid1a/Yap/Taz LKO (n = 5) mice 
treated with tamoxifen and then DDC. Data are normalized to livers of 
mice not treated with DDC (n = 4). Data are mean ± s.d. of the indicated 
number of mice. This experiment was independently repeated three 
times with similar results, analysing, in total, at least 10 mice for each 
genotype. h, Representative E-cadherin staining showing that CCA lesions 
retain an epithelial morphology in sections of the liver of the indicated 
genotype. Scale bar, 30 µm. Experiments were independently repeated on 
three DDC-treated Arid1a LKO mice, with similar results. P values were 
determined by one-way ANOVA with Dunnett’s multiple comparisons test (a) 
or with Tukey’s multiple comparisons test (e, g). 

Extended Data Fig. 6 | Interaction of SWI/SNF with F-actin and 
YAP is mutually exclusive. a, Related to Fig. 3a. HEK293T cells were 
transfected with Flag-NLS-6-actin(R62D). Representative anti-Flag 
immunofluorescence images to visualize transfected Flag~-NLS-(-actin. 
Nuclei were counterstained with DAPI. Scale bar, 101m. b, c, Related to 
the PLAs shown in Fig. 3b. b, Negative controls for the PLA of Fig. 3b: in 
the absence of one of the two partners, no dots can be seen. c, In HEK293T 
cells, by PLA, endogenous BRM interacts with Flag-tagged NLS-6- 
actin(WT), but not with Flag-tagged NLS-(-actin(R62D), indicating that 
the association is specific to filamentous, and not monomeric, }-actin. 

d, Western blots of the inputs of the experiment shown in Fig. 3c. 

e, Sequential salt extraction of HEK293T cells treated with either 
phalloidin (Phall) or latrunculin A (Lat.A). Western blots of the indicated 
proteins are shown. H3 was loaded on a different blot. f, Western blots 

of the inputs of the experiment shown in Fig. 3d. MCFI10AT cells were 
transfected with control siRNAs (siCo., lanes 1 and 2) or siRNAs against 
ARIDIA (si1A; lane 3) and treated with phalloidin (lane 1) or latrunculin 


A (lanes 2 and 3), as indicated. g, Continuation of Fig. 3e. A PLA was 
carried out to detect the interaction between endogenous BRM and NLS- 
YAP in MCF10A cells. Control untreated cells, 0% PLA-positive cells; cells 
treated with the Src inhibitor dasatinib (that is, a low-mechanics 
condition in addition to those shown in Figs. 3e), 14.5% PLA-positive 
cells. h, Co-immunoprecipitation and western blot analysis of MCF10AT 
lysates showing endogenous ARID1A bound to endogenous YAP but not 
to TEAD1 and TEAD4. As a specificity control, immunoprecipitation with 
unrelated rabbit IgG was repeated using the same lysates. i, j, Related to 
Fig. 3f. i, Representative PLA images detecting the interaction between 
endogenous TEAD and NLS-YAP in MCFI1O0A cells. The YAP-TEAD1 
association is lost in C3-treated cells (that is, in cells with attenuated 
mechanotransduction (low mechanics) upon C3-mediated inhibition 

of RhoGTPases), but rescued after depletion of ARID1A (PLA-positive 
cells: 43.4%). j, Specificity controls of single antibodies for the PLA shown 
in iand in Fig. 3f. a—c, e, g-j are representative experiments, repeated 
independently two (e, h) or three (a-c, g, i, j) times, with similar results. 
Extended Data Fig. 7 | Loss of SWI/SNF restores YAP/TAZ 
transcriptional activity in mechanically inhibited cells. a, Representative 
confocal images (left) and quantification (right; >100 cells per condition) 
of YAP/TAZ localization in MCF10A cells transfected with the indicated 
siRNAs and replated on a soft ECM. b, MCF10A cells were transfected 
with the indicated siRNAs, and left untreated (control) or treated with 
anti-integrin-β1 antibodies, the Rho inhibitors C3 and cerivastatin, the 
Src inhibitor dasatinib or the ROCK inhibitor fasudil. qPCR analyses of 
CTGF expression (mean ± s.d. of n = 3 biologically independent samples). 
Anti-integrin-β1 and fasudil were part of the same experiment and thus 
share the same control repeated in their corresponding graphs. c, HaCaT 
cells were transfected with the indicated siRNAs and replated to obtain 
either sparse (high mechanics) or dense monolayers (low mechanics). 
qPCR analyses of CTGF expression. Data are mean ± s.d. of n = 3 
biologically independent samples. d, Efficiency of Arid1a downregulation 
in Arid1a fl/fl fibroblasts after transduction with Adeno-Cre, measured by 
qPCR (data are normalized to adeno-GFP-transduced cells and presented 
as mean ± s.d. of n = 3 biologically independent samples) and western blot 
(in which GAPDH was used as a loading control). e, MCF10A cells were 
transfected with the indicated siRNAs and replated at very high density 
(see Methods). qPCR analyses of CTGF expression. Data are mean ± s.d. 
of n = 3 biologically independent samples. All panels display representative 
experiments, repeated independently three times with similar results. 
P values were determined by unpaired two-sided Student's t-test. 
Extended Data Fig. 8 | Loss of SWI/SNF enables YAP-induced biological 
effects in mechanically inhibited cells. a, MCF10A cells were transfected 
with the indicated siRNAs, and replated to obtain dense monolayers (low 
mechanics). After 24 h, cells were incubated for 1 h with a pulse of EdU to 
label cells undergoing DNA duplication. Cells were fixed and processed 
for EdU staining. Proliferation was quantified as the 
relative number of EdU+ cells. Data are normalized to sparse cells (high 
mechanics) transfected with control siRNA. Data are mean ± s.e.m. 
of at least n = 3 biologically independent samples. Statistics for rescue 
experiments at low mechanics: control siRNA (n = 3) versus siBRM/BRG1 
mix A (n = 3), P = 0.0003; control siRNA versus siBRM/BRG1 
mix B (n = 3), P = 0.0005; control siRNA versus siARID1A#1 (n = 3), 
P = 0.04; control siRNA versus siARID1A#2 (n = 3), P = 0.002. A 
representative experiment is shown; experiments were repeated 
independently twice with similar results. b, Neurons were plated on a 
stiff or soft ECM and infected with inducible YAP-encoding vectors. 
Quantification of neurospheres emerging from these cultures after 
doxycycline treatment in NSC medium. Data are mean ± s.e.m. of all 
biological independent samples of three experiments, n = 9. c, d, Related 
to Fig. 4d. Neurons were plated on a soft ECM and infected with inducible 
YAP-encoding vectors or empty vector and the indicated shRNA-encoding 
lentiviral vectors. c, d, Representative images (c) and quantification 

(d) of neurospheres (PO) emerging after doxycycline treatment. Scale 

bar, 300m. Data are mean + s.e.m. of four independent experiments. 

*P = 0.03, control shRNA (shCo) versus Brm shRNA (shBrm#1 or 
shBrm#2) in neurons transduced with YAP(WT); P = 0.03, control 
shRNA versus Arid1a shRNA (shArid1a#1 or shArid1a#2) in neurons 
transduced with YAP(WT). e, Fold change in expression in neurospheres 
emerging from cultures of YAP-induced neurons transduced with the 
indicated shRNAs against Brm or Arid1a, and plated either on a stiff 
(high mechanics) or soft (low mechanics) ECM, with respect to the 
corresponding control shRNA-expressing cultures. Data are mean ± s.e.m. 
of four independent experiments. *P = 0.03, for comparisons between 
Brm or Arid1a shRNA under high mechanical conditions and the 
corresponding samples under low mechanical conditions. P values were 
determined by unpaired two-sided Student's t-test (a) and by two-sided 
Mann-Whitney U-test (b, d, e). 
Cryo-EM reveals two distinct serotonin-bound 
conformations of full-length 5-HT3A receptor 


Sandip Basak, Yvonne Gicheru, Shanlin Rao, Mark S. P. Sansom & Sudha Chakrapani 


The 5-HT3A serotonin receptor¹, a cationic pentameric ligand-gated 
ion channel (pLGIC), is the clinical target for management of nausea 
and vomiting associated with radiation and chemotherapies². Upon 
binding, serotonin induces a global conformational change that 
encompasses the ligand-binding extracellular domain (ECD), the 
transmembrane domain (TMD) and the intracellular domain (ICD), 
the molecular details of which are unclear. Here we present two 
serotonin-bound structures of the full-length 5-HT3A receptor in 
distinct conformations at 3.32 Å and 3.89 Å resolution that reveal the 
mechanism underlying channel activation. In comparison to the apo 
5-HT3A receptor, serotonin-bound states underwent a large twisting 
motion in the ECD and TMD, leading to the opening of a 165 Å 
permeation pathway. Notably, this motion results in the creation 
of lateral portals for ion permeation at the interface of the TMD 
and ICD. Combined with molecular dynamics simulations, these 
structures provide novel insights into conformational coupling 
across domains and functional modulation. 

Recent high-resolution pLGIC structures have highlighted several 
fundamentals of the gating mechanism³⁻⁷. However, there is a lack of 


Fig. 1 | Ion permeation pathway. a, The profile of the ion permeation 
pathway for the full-length 5-HT3A receptor in the apo state (salmon red) 
and in the two serotonin-bound conformations, state 1 (teal) and state 2 
(yellow). The same colour scheme is used to represent the three states in 
all subsequent figures. For clarity, the cartoon representation is shown 
only for two subunits. Green and blue spheres define radii of 1.8-3.3 Å and 




information on conformational coupling between the different domains 
and, particularly, how the ICD modulates the overall channel function. 
The ICD has a role in localization to the plasma membrane⁸ and 
regulates single-channel conductance, rectification and gating⁹,¹⁰. We 
recently reported the structure of the full-length mouse 5-HT3A 
receptor (Extended Data Table 1a) in the unliganded (apo) state¹¹, solved by 
single-particle cryo-electron microscopy (cryo-EM), which revealed 
key features of the resting conformation. In the present study, we 
determined structures of the 5-HT3A receptor in the presence of 100 μM 
serotonin by cryo-EM to gain insights into the mechanism of activation 
by serotonin. High-resolution data collection and processing revealed 
two distinct populations of the receptor with final 3D reconstructions 
to overall resolutions of 3.32 Å and 3.89 Å, which we will refer to as 
state 1 and state 2, respectively (Extended Data Figs. 1, 2, Extended Data 
Table 1b). The map for both states contained density for the entire ECD, 
TMD and a large region of the ICD (Extended Data Fig. 3), and the 
overall 3D architectures were similar to the apo structure¹¹. 

The apo, state 1 and state 2 structures reveal distinct conforma- 
tions of the pore (Fig. 1a, b). The apo pore is constricted at multiple 
>3.3 Å, respectively. The position at which the pore is constricted below 
2.76 Å in the apo state is shown as sticks. b, The pore radius is plotted as 
a function of distance along the pore axis. The dashed line indicates the 
approximate radius of a hydrated Na⁺ ion¹². c, Side (left) and top (right) 
views of the pore-lining M2 helices showing superposition of the apo, state 
1 and state 2 structures. 
Fig. 2 | The serotonin-binding site and global conformational 
differences between the apo and serotonin-bound states. a, Top, the 
state 1 map around the side chains of residues at the subunit interface 
that constitute the serotonin-binding site (top left, contoured at 10σ); 
the density map for serotonin in state 1 (top right, contoured at 7.5σ). 
Bottom, the state 2 density map for the same residues (bottom left, 9σ) and 
serotonin (bottom right, 7.5σ). b, A comparison of the ECDs of the apo 
structure with state 1 (left) and state 2 (right) when aligned with respect to 
the TMD. Arrows indicate the direction of displacements between the two 
structures. c, A view of the TMDs of the (−) subunit from the extracellular 
end when aligned with respect to the ECDs of the (+) subunit for state 1 
and apo (left), and for state 2 and apo (right). 


locations along the permeation pathway (Lys108 and Asp105 in the 
ECD, Leu260 (L9′) and Glu250 (E−1′) in the TMD, and Arg416 in the 
ICD) to radii below approximately 2.76 Å (the radius of a hydrated Na⁺ 
ion)¹², reflecting its non-conductive conformation. The numbering in 
parentheses refers to the residue positions within the pore-lining M2 
helices. Whereas Leu260 in the middle of the M2 helix forms part of 
the activation gate¹³, Glu250 at the intracellular end forms the 
selectivity filter¹⁴. Compared to the apo pore, state 1 exhibits an expansion 
of the pore at each of these constriction points. The radius at Leu260 
is approximately 3.0 Å and the pore within M2 is narrowest at Ser253 
(S2′) (approximately 2.7 Å), suggesting that these locations may impede 
permeation of hydrated Na⁺ ions. State 2 has the widest pore among the 
three structures, with an internal radius larger than 3.2 Å, notably at 
positions below Leu260 and extending all the way into the ICD, 
indicating a potentially conductive conformation. In comparison to the 
apo structure, the M2 helices are rotated clockwise by 7.5° in state 1 
and by 13° in state 2, and positioned outward (Fig. 1c) with the Leu260 
side chains rotated away from the pore axis. Additionally, in state 2, 
the helix is bent at Ser253 and tilted 25° away from the pore axis at the 
level of Glu250, thereby creating a wider vestibule at the intracellular 
end of M2. 
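
The comparison above amounts to checking quoted pore radii against the approximate radius of a hydrated Na⁺ ion. The following Python sketch is an illustration only and is not part of the authors' analysis; the function name `below_hydrated_na_radius` and the dictionary layout are assumptions introduced here, and only radii stated explicitly in the text are used.

```python
# Approximate radius of a hydrated Na+ ion quoted in the text (angstroms).
HYDRATED_NA_RADIUS_A = 2.76

# Approximate state 1 pore radii quoted in the text (angstroms).
STATE1_RADII_A = {
    "Leu260 (9')": 3.0,
    "Ser253 (2')": 2.7,
}


def below_hydrated_na_radius(radius_a, threshold_a=HYDRATED_NA_RADIUS_A):
    """Return True if a pore radius is smaller than the hydrated Na+ radius."""
    return radius_a < threshold_a


if __name__ == "__main__":
    for residue, radius in STATE1_RADII_A.items():
        side = "below" if below_hydrated_na_radius(radius) else "at or above"
        print(f"state 1, {residue}: {radius:.2f} A is {side} the 2.76 A threshold")
```

A hard cutoff is a simplification: a radius only slightly above the threshold, such as the value at Leu260 in state 1, may still impede permeation, which is why the text treats both positions as potential barriers.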




State 1 and state 2 reveal a distinct density for serotonin at the 
neurotransmitter-binding site located at the interface of two adjacent subunits 
(Fig. 2a, right). Residues from loops A, B and C on the principal (+) 
subunit and loops D, E and F from the complementary (−) subunit¹⁵,¹⁶ 
form a cage-like enclosure for serotonin (Fig. 2a, left). In state 1 and 
state 2, loop C is in a closed position in comparison to the outward 
or ‘open’ orientation of loop C in the apo state (Fig. 2b), consistent 
with agonist-bound conformations of the acetylcholine-binding 
protein¹⁷. Several interactions between serotonin and binding-site residues 
(Trp156, Arg65 and Trp63) have previously been proposed¹⁸,¹⁹; these 
residues are within 4 Å of serotonin in state 1 and state 2. 

A comparison with the apo structure reveals a global twisting of the 
ECD and TMD in the serotonin-bound states (Extended Data Fig. 4a). 
There is an overall counter-clockwise rotation of the ECD around the 
pore axis, leading to major repositioning of individual interfacial 
loops (Fig. 2b, Extended Data Fig. 4b), similar to other pLGICs. As 
a result, buried areas between adjacent subunits are reduced in state 1 
(3,096 Å²) and state 2 (2,533 Å²) compared to the apo state (3,161 Å²). 
This change is also reflected in decreasing inter-subunit interactions at 
the ECD-TMD and TMD-ICD interfaces from the apo state to state 
1, and from state 1 to state 2 (Extended Data Fig. 5). At the level of the 
TMD, serotonin induces a clockwise rotation with an expansion of 
the transmembrane helices away from the pore axis (Extended Data 
Fig. 4a (bottom), c). An outward displacement of M2 is accompanied 
by a marked outward movement of the M2-M3 loop away from the 
inter-subunit interface (Fig. 2c), which reduces its interactions with the 
pre-M1 region and the β8-β9 loop in the neighbouring subunit relative 
to the apo state (Extended Data Fig. 5a). 
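
To put the buried-area values in proportion, the reductions relative to the apo state can be worked out directly from the three numbers quoted above. This is a small illustrative calculation, not something reported in the text; the helper name `percent_reduction` is an assumption introduced here.

```python
# Buried areas between adjacent subunits quoted in the text (square angstroms).
BURIED_AREA_A2 = {"apo": 3161.0, "state 1": 3096.0, "state 2": 2533.0}


def percent_reduction(reference, value):
    """Percentage by which `value` is smaller than `reference`."""
    return 100.0 * (reference - value) / reference


if __name__ == "__main__":
    apo = BURIED_AREA_A2["apo"]
    for state in ("state 1", "state 2"):
        drop = percent_reduction(apo, BURIED_AREA_A2[state])
        print(f"{state}: buried area reduced by {drop:.1f}% relative to apo")
```

Under these numbers the reduction is roughly 2% for state 1 and roughly 20% for state 2, in line with the much larger rearrangements described for state 2.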

The most notable difference among the three structures is the 
conformation of the ICD, which is composed largely of the M3-M4 
linker. In state 1 and the apo state, the membrane-associated (MA) 
helix²¹ appears as a straight helix extending into M4. In state 2, the 
MA-M4 helix is bent (20° with respect to the MA helix) in the vicinity 
of Gly430, and appears as two separate helices that are tilted away 
from the pore axis (Fig. 3a, Extended Data Fig. 6), thereby enlarging 
the central cavity at the TMD-ICD interface (Fig. 1a). Gly430 may 
introduce greater flexibility at the hinge point between the MA and 
M4 helices. In the apo state, the ion-exit pathways are occluded at two 
different levels: the post-M3 loop obstructs the lateral portals lined 
by MA-M4 helices (Extended Data Fig. 5b), and the MA helices form 
a tight bundle that sterically occludes ion permeation along the pore 
axis (Fig. 1b, Extended Data Fig. 6). Whereas there are small 
conformational changes in these regions in state 1 relative to the apo state, 
there are much larger differences in state 2; the post-M3 loop extends 
away from the MA-M4 helix, creating lateral portals with openings of 
dimensions 16.0 Å × 11.4 Å (Fig. 3b, c). These portals are large enough 
to accommodate hydrated Na⁺ ions and may serve as exit pathways for 
permeant ions (Fig. 3b), consistent with the early hypotheses made on 
the basis of studies on the nicotinic acetylcholine receptor (nAChR) 
and the 5-HT3A receptor. The MX helix in the apo state and state 1 
lies parallel to the putative membrane-water interface. In state 2, it 
is displaced upward from the interface and, consequently, pulls the 
post-M3 loop away from the lateral portal. Additionally, the outward 
movement of the MA helix disrupts the tight packing of the helical 
bundle structure, thereby widening the pore in this region. 

The electrostatic-potential maps show that, whereas the ion 
permeation pathway in the ECD and the TMD is lined with predominantly 
electronegative side chains (Fig. 3c), the ICD is lined with clusters of 
positively charged residues on the MA helix. In state 2, the entrances to 
the lateral portals are lined with three key arginine residues (Arg416, 
Arg420 and Arg424) from the MA helix (Extended Data Fig. 6a), which 
are reported to be responsible for the unusually low single-channel 
conductance of the 5-HT3A receptor (0.4-0.6 pS) through mechanisms 
involving steric occlusion and electrostatic repulsion. As expected, 
mutations to the Arg side chain markedly increase single-channel 
conductance (up to 40-fold). Previous studies have demonstrated that 
both the length and the sequence of the M3-M4 linker have substantial 
Fig. 3 | Opening of the lateral portal for ion exit. a, An alignment of 
apo, state 1 and state 2 structures. The TMD and ICD are shown for two 
adjacent subunits. The arrows show the direction of relative movements 
of the helices. b, The solvent-accessible vestibule in the ICD of state 2 was 
calculated with a minimum cavity radius of 2.8 Å. One of the subunits 
in the surface representation is removed for clarity. The plausible 
ion-exit pathways are indicated by dotted arrows. c, The solvent-accessible 
electrostatic potential map generated using the APBS tool. The inset shows 
a zoomed-in view of the ICD to highlight the progressive opening of the 
lateral portal from the apo structure to state 2. Residues Arg416, Arg420 
and Arg424 (shown in stick representation) are implicated in regulating 
the single-channel conductance of the 5-HT3A receptor. 


effects on channel function, and several positions in the MA helix 
regulate single-channel conductance, inward rectification, gating 
and desensitization. Collectively, and in the light of the structures 
presented here, these studies underscore the role of the ICD in many 
aspects of channel function. 

To assess the conductance in the apo state, state 1 and state 2, we 
performed molecular dynamics simulations with the structures inserted 
into a phospholipid bilayer (Fig. 4). An analysis of simulated water 
density along the pore axis suggests that the apo structure is closed to water, 
with two hydrophobic constrictions that are de-wetted: one is at about 
−60 Å in the ICD, lined by hydrophobic residues Leu402, Leu406 and 
Ile409 on the MA helix, and the other is at about 0 Å around Leu260 
in M2. In simulations performed with a transmembrane potential, no 
permeation events were observed for Na⁺ ions in this conformation. 
In state 1, similar energetic barriers for water were present along the 
pore. However, a small number of Na⁺ ions were observed to traverse 
the channel when a transmembrane potential difference was added 
to the simulation. State 2, on the other hand, did not present a barrier 
for water within the TMD, and Na⁺ permeation events were observed 
throughout the simulations. However, the hydrophobic region at −60 Å 
was almost entirely de-wetted, and Na⁺ ions failed to permeate this 
region. Instead, the ions exited the ICD through the lateral portals, 
consistent with predictions that these regions serve as exit pathways 
for ions (Supplementary Video 1). 
Fig. 4 | Molecular dynamics simulations of apo, state 1 and state 2 
structures. a, b, Each structure was subjected to three 50-ns equilibrium 
simulations, with the replicates initiated from separately assembled 
protein-membrane systems. Radius and energy profiles were calculated 
for each simulation (using the final 40 ns of the trajectory, with a sampling 
interval of 0.5 ns) and averaged across replicates. For each structure, the 
mean profile and the one-standard-deviation range between calculations 
(n = 3) are shown as a grey band. a, Mean radius along the central pore axis. 
The dashed line indicates the approximate radius of a hydrated Na⁺ ion. 
b, Corresponding free-energy profiles of a water molecule along the 
central pore axis. c, Trajectories of water and Na⁺ ion coordinates 
within 5 Å of the channel axis inside the pore over 100 ns with a 0.2-V 
transmembrane potential difference, with the cytoplasmic side having a 
negative potential. White stretches indicate regions devoid of water and 
ions. The energetic barriers due to the ring of Leu260 and Ile409 are at 
z ≈ 0 Å and z ≈ −60 Å, respectively. One of three independent 100-ns 
replicates is shown for each structure. 
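
The replicate averaging described in this legend (a mean profile with a one-standard-deviation band across n = 3 simulations) can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function name `summarize_replicates` is hypothetical, the toy profiles are invented numbers, and the use of the sample standard deviation is an assumption, since the legend does not say whether the sample or population form was used.

```python
from statistics import mean, stdev


def summarize_replicates(profiles):
    """Return per-position mean and sample s.d. across replicate profiles.

    `profiles` is a list of equally long lists, one per replicate, where each
    entry is the value (for example, pore radius) at one position on the axis.
    """
    n_points = len(profiles[0])
    if any(len(profile) != n_points for profile in profiles):
        raise ValueError("all replicate profiles must have the same length")
    columns = list(zip(*profiles))  # regroup values by axis position
    means = [mean(column) for column in columns]
    sds = [stdev(column) for column in columns]
    return means, sds


if __name__ == "__main__":
    # Invented toy radii (angstroms) at three axis positions, three replicates.
    toy = [[2.5, 3.0, 4.1], [2.7, 3.2, 4.0], [2.6, 3.1, 4.3]]
    means, sds = summarize_replicates(toy)
    for m, s in zip(means, sds):
        print(f"mean {m:.2f} A, band {m - s:.2f} to {m + s:.2f} A")
```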


Overall, these findings reveal that the apo state is non-conductive, 
whereas state 2 represents a conductive conformation. Brief and infre- 
quent permeation events observed for state 1 suggest that it has low 
conductance. Closer examination of state 1 shows that the pore is 
de-wetted when the Leu260 side chains point inwards (as seen in the 
apo state) and the pore is hydrated when the side chains point away 
(Extended Data Fig. 7). Of note, Glu250 side chains also underwent 
considerable shifts in conformation that led to changes in the local pore 
radius, which was sometimes reduced to approximately 2.4 Å when 
the Glu side chains were pointing inwards (Extended Data Fig. 8). 
Therefore, on the basis of these analyses we conclude that, although 
state 1 is mostly non-conductive, side-chain fluctuations may allow a 
small conductance in this state. 

A comparison with representative pLGIC conformational states 
shows that in the resting conformation the hydrophobic extracellular 
half of M2 forms the activation gate (with 9′ being the narrowest 
region), and in the desensitized conformation the intracellular end 
is constricted (Extended Data Fig. 9). However, notable differences 
in the extent of constrictions may reflect the inherent differences in 
gating kinetics, origin or perhaps the nature of biochemical 
modification involved in determining these structures. Leu260 (9′) is 
conserved across pLGICs, and our finding of the key role of this residue 
in pore constriction and hydration is consistent with its role in gating 
and desensitization. At the intracellular end, Ser253 (2′) lines 
the narrowest region in state 1, and the S253T mutation results in 
serotonin-induced currents (10 μM) with unusually slow kinetics of 
decay (Fig. 5a, b). The narrow pore at this position is also consistent 
with Cd²⁺-coordination studies. Glu250 (−1′) is positioned at 
Fig. 5 | Functional characterization of mutations in pore-lining 
residues. a, Two-electrode voltage-clamp (TEVC) recording (at −60 mV) 
of wild-type (WT) 5-HT3A receptor and G249A, E250D and S253T 
mutants, expressed in oocytes. Currents were elicited in response to 10 μM 
serotonin (duration of serotonin application is shown as an orange line). 
b, The mutated residues in M2 (ribbons) are shown as sticks on two 
subunits. c, A plot of the ratio of current measured at t = 20 s over the 
peak current amplitude. The number of individual oocyte recordings is 
indicated in parentheses. Data are shown as mean ± s.d. ***, statistically 
significant. E250D, P = 0.0003; S253T, P = 9.6 × 10⁻¹¹; H309S, P = 0.0002. 
Two-sample t-test for mutants and wild type at 95% confidence level. d, 
Top, interaction between Glu250 and His309 from two adjacent subunits 
as seen from the extracellular side. Bottom, TEVC recording of current 
induced by 10 μM serotonin for the H309S mutant. 


the narrowest region in the nAChR structure captured in a 
desensitized conformation, and the side chain shows extensive fluctuations 
in the simulations of state 1. A charge-preserving mutation at 
this position (E250D) led to enhanced desensitization (Fig. 5a, c). 
In state 2, Glu250 appears to potentially interact with His309 in M3 
from the adjacent subunit (Fig. 5d), and the H309S mutant also shows 
rapidly desensitizing currents (Fig. 5c, d, bottom). There was no 
notable change in the desensitization of G249A, consistent with findings 
that the M1-M2 linker has relatively minimal effect on 
desensitization. Overall, these results are in agreement with the idea that the 
intracellular end of M2 has an important role in ion selectivity and 
gating. 

Together, the apo and serotonin-bound 5-HT3A receptor 
structures provide many insights into the activation mechanism. Serotonin 
induces a global twisting in the ECD, TMD and ICD, leading to reduced 
inter-subunit interactions and larger solvent-exposed surfaces. The 
state 2 conformation features large displacements in the ICD that widen 
the central cavity and open ion-exit pathways at the lateral portals and 
along the pore axis. The overall pore conformation suggests that state 2 
is likely to represent a conductive, open state. We cannot unequivocally 
assign a functional state to the state 1 conformation, and it is unclear 
whether it corresponds to a pre-open, non-conducting intermediate or 
a desensitized state. Further studies are needed to evaluate this 
conformation and determine the significance of potential intermediate 
states in channel gating. 
METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

Cloning and functional measurements in oocytes. The gene encoding the mouse 
5-HT3A receptor (purchased from GenScript) was inserted into the pTLN vector 
for Xenopus laevis oocyte expression and confirmed by DNA sequencing. DNA 
linearization was carried out by incubation with the MluI restriction enzyme 
overnight at 37°C. mRNA synthesis was done using the mMessage mMachine kit 
(Ambion) as per the manufacturer’s instructions. The eluted mRNA was purified 
with RNeasy (Qiagen), and injected (3-10 ng) into Xenopus laevis oocytes (stages 
V-VI). As a control to verify that no endogenous currents were present, oocytes 
were injected with the same volume of water. The oocytes used in this study were 
kindly provided by W. F. Boron. Female Xenopus laevis were purchased from Nasco. 
All animal experimental procedures were approved by the Institutional Animal Care 
and Use Committee (IACUC) of Case Western Reserve University. Oocytes were 
maintained at 18°C in OR3 medium (GIBCO-BRL Leibovitz medium 
containing glutamate, 500 units each of penicillin and streptomycin, pH adjusted to 7.5, 
osmolarity adjusted to 197 mOsm). Two-electrode voltage-clamp experiments were 
performed at room temperature 2-5 days after injection on a Warner Instruments 
Oocyte Clamp OC-725. The currents were sampled and digitized at 500 Hz with a 
Digidata 1332A and analysed by Clampfit 10.2 (Molecular Devices). Oocytes were 
clamped at a holding potential of −60 mV, and currents were recorded in response 
to serotonin application. Solutions were changed using a syringe pump perfusion 
system flowing at a rate of 6 ml/min. The electrophysiological solutions contained 
96 mM NaCl, 2 mM KCl, 1.8 mM CaCl2, 1 mM MgCl2, and 5 mM HEPES (pH 7.4, 
osmolarity adjusted to 195 mOsm). All chemical reagents were purchased from 
Sigma-Aldrich. For wild type and mutants, the current decay was assessed by the 
ratio of the current measured at time = 20 s (from the start of ligand application) 
over peak current amplitude. 
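
The decay metric defined above (current at 20 s after the start of ligand application divided by the peak current) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Clampfit procedure used in the study: the function name `decay_ratio`, the choice of the nearest sample to the 20-s mark, and the toy trace are all introduced here for illustration.

```python
def decay_ratio(times_s, currents, t_application_s, t_read_s=20.0):
    """Ratio of the current at `t_read_s` after ligand application to the peak.

    The peak is taken as the sample of largest absolute amplitude after the
    start of application, so inward (negative) currents give a positive ratio.
    """
    # Keep only samples recorded from the start of ligand application onward.
    post = [(t - t_application_s, i)
            for t, i in zip(times_s, currents) if t >= t_application_s]
    if not post:
        raise ValueError("no samples at or after ligand application")
    peak = max(post, key=lambda sample: abs(sample[1]))[1]
    if peak == 0:
        raise ValueError("peak current is zero; ratio is undefined")
    # Use the sample closest in time to the requested read-out point.
    at_read = min(post, key=lambda sample: abs(sample[0] - t_read_s))[1]
    return at_read / peak


if __name__ == "__main__":
    # Invented toy trace: ligand applied at t = 5 s, inward current in uA.
    times = [0.0, 5.0, 6.0, 15.0, 25.0, 30.0]
    trace = [0.0, 0.0, -10.0, -7.0, -4.0, -3.0]
    print(f"I(20 s) / I(peak) = {decay_ratio(times, trace, 5.0):.2f}")
```

A ratio close to 1 indicates little decay over 20 s, whereas a ratio close to 0 indicates rapid decay.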

Cloning and transfection. Codon-optimized mouse Htr3a gene (NCBI Reference 
Sequence: NM_001099644.1) was purchased from GenScript and subcloned into 
the pFastBac1 vector. The pFastBac1 vector includes four strep-tags (WSHPQFEK) 
at the N terminus, followed by a linker sequence (GGGSGGGSGGGS) and a 
TEV-cleavage sequence (ENLYFQG). The construct also includes a C-terminal 
1D4-tag. Spodoptera frugiperda cells (Sf9, Invitrogen) were cultured in ESF921 
medium (Expression Systems) in the absence of antibiotics and incubated at 28°C 
without CO2 exchange. Sub-confluent cells were transfected with recombinant 
bacmid DNA using Cellfectin II transfection reagent (Invitrogen) per the 
manufacturer’s instructions. The cell culture supernatants were collected and centrifuged at 1,000g 
for 5 min to remove cell debris to obtain progeny 1 (P1) recombinant baculovirus 
72 h post-transfection. P2 viruses were obtained through consecutive rounds of 
Sf9 cell infection with P1 viruses. The supernatants (P2) were used to infect Sf9 
cells, thereby generating P3 viruses. These viruses (P3) were used for recombinant 
protein production. 

Expression and purification of recombinant protein. Recombinant protein 
production was performed by infection of approximately 2.5 × 10⁶ per ml Sf9 cells with 
P3 recombinant viruses. At 72 h post-infection, the cell medium was collected and 
centrifuged at 8,000g for 20 min at 4°C to separate the supernatant from the pellet. 
The cell pellet was then resuspended in 20 mM Tris-HCl, pH 7.5, 36.5 mM sucrose, 
and 1% protease inhibitor cocktail. Cells were disrupted by sonication on ice and 
non-lysed cells were removed by centrifugation (3,000g for 15 min). The membrane 
fraction was separated by ultracentrifugation (167,000g for 1 h) and solubilized with 
1% C12E9 in a buffer containing 500 mM NaCl, 50 mM Tris pH 7.4, 10% glycerol 
and 0.5% protease inhibitor by rotating for 2 h at 4°C. Non-solubilized material was 
removed by ultracentrifugation (167,000g for 15 min). The supernatant was collected 
and bound with 1D4 beads pre-equilibrated with 150 mM NaCl, 20 mM HEPES 
pH 8.0 and 0.01% C12E9 for 2 h at 4°C. The beads were then washed with 100 
column volumes of 150 mM NaCl, 20 mM HEPES pH 8.0, and 0.01% C12E9 (buffer A). 
The protein was then eluted with buffer A supplemented with 3 mg/ml 1D4 
peptide (TETSQVAPA). Eluted protein was then concentrated and deglycosylated with 
PNGase F (NEB) by incubating 5 units of the enzyme per 1 μg of the protein for 
2 h at 37°C under gentle agitation. Deglycosylated protein was then applied to a 
Superose 6 column (GE Healthcare) equilibrated with buffer A. The peak fractions 
around 13.9 ml were pooled and concentrated to 2-3 mg/ml using 50-kDa MWCO 
Millipore filters (Amicon) and used subsequently for cryo-EM studies. 

Sample preparation and cryo-EM data acquisition. Functional characterization 
shows that serotonin-induced 5-HT3A receptor currents saturate at 30 μM and 
beyond¹¹,³⁴. Therefore, the 5-HT3A receptor protein (~2.5 mg/ml) was filtered 
and first incubated with 100 μM serotonin for 30 min. After this, 3 mM 
fluorinated Fos-choline-8 (Anatrace) was added and the sample was incubated until 
blotting. The sample was blotted twice with 3.5 μl sample each time onto Cu 300 
mesh Quantifoil 1.2/1.3 grids (Quantifoil Micro Tools), and immediately after the 
second blot, the grid was plunge frozen into liquid ethane using a Vitrobot (FEI). 
The grids were imaged using a 300 kV FEI Titan Krios microscope equipped with 
a Gatan K2-Summit direct detector camera. Movies containing 40 frames were 
collected at 130,000× magnification (set on microscope) in super-resolution mode 
with a physical pixel size of 0.532 Å/pixel, dose rate of 3.754 electrons/pixel/s, and 
a total exposure time of 12 s. Defocus values of the images ranged from −1.0 to 
−2.5 μm (input range setting for data collection) as per the automated imaging 
software Latitude S (Gatan). 
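
The acquisition parameters above determine the accumulated dose per movie. The Python sketch below shows the arithmetic under one loudly stated assumption: that the quoted dose rate refers to a detector pixel of 1.064 Å (the 2 × 2 binned pixel size given in the image-processing description), with 0.532 Å being the corresponding super-resolution pixel. The text does not state this explicitly, so the resulting numbers are estimates rather than reported values; the function names are introduced here for illustration.

```python
def total_dose_e_per_a2(dose_rate_e_per_px_s, exposure_s, pixel_size_a):
    """Total electron dose per square angstrom accumulated over one movie."""
    electrons_per_pixel = dose_rate_e_per_px_s * exposure_s
    return electrons_per_pixel / (pixel_size_a ** 2)


def dose_per_frame(total_dose, n_frames):
    """Dose per frame, assuming the exposure is split evenly across frames."""
    return total_dose / n_frames


if __name__ == "__main__":
    DOSE_RATE = 3.754          # electrons/pixel/s, quoted in the text
    EXPOSURE_S = 12.0          # total exposure time, quoted in the text
    N_FRAMES = 40              # frames per movie, quoted in the text
    ASSUMED_PIXEL_A = 1.064    # ASSUMPTION: pixel the dose rate refers to

    total = total_dose_e_per_a2(DOSE_RATE, EXPOSURE_S, ASSUMED_PIXEL_A)
    print(f"total dose ~ {total:.1f} e/A^2")
    print(f"per frame  ~ {dose_per_frame(total, N_FRAMES):.2f} e/A^2")
    print(f"frame time = {EXPOSURE_S / N_FRAMES:.2f} s")
```

If the dose rate were instead defined per 0.532 Å pixel, the same formula would give a value four times larger, so the pixel convention matters when comparing doses across studies.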

Image processing. Beam-induced motion was corrected using MotionCor2 with 
a B-factor of 150 pixels². Super-resolution images were binned (2 × 2) in Fourier 
space, making a final pixel size of 1.064 Å. All subsequent data processing was 
conducted in RELION 2.1. The defocus values of the motion-corrected micrographs 
were estimated using Gctf software. Approximately 3,000 particles were manually 
picked from the 2,810 micrographs and sorted into two-dimensional (2D) 
classes. The best of these classes were then used as templates for automated particle 
picking. A loose auto-picking threshold was selected to ensure no good particles 
were missed at this stage. This resulted in 749,970 auto-picked particles that were 
subjected to 2D classification to remove suboptimal particles. An initial 3D model 
was generated from the apo-5-HT3A receptor cryo-EM structure (RCSB Protein 
Data Bank code (PDB ID): 6BE1) and low-pass filtered to 60 Å using EMAN2. 
Multiple rounds of 3D auto-refinement and 3D classification generated five good 
classes. Of these, two classes (containing a total of 115,992 particles) belonged 
to state 1 and the other three classes (containing a total of 25,547 particles) 
represented the state 2 conformation. Subsequent 3D re-classification, auto-refinement 
with C5 symmetry imposed, and post-processing yielded state 1 and state 2 5-HT3A 
receptor structures with final totals of 103,698 and 18,839 particles, respectively. 
In the post-processing step in RELION, a soft mask was calculated and applied to 
the two half-maps before the Fourier shell correlation (FSC) was calculated. The 
B-factor estimation and map sharpening were also performed in the post-processing 
step. The overall resolutions of state 1 and state 2 were calculated to be 3.32 Å and 
3.89 Å, respectively (based on the gold-standard FSC = 0.143 criterion). Local 
resolutions were estimated using the ResMap software. 
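
The FSC = 0.143 criterion reports the resolution at which the half-map correlation curve first falls below 0.143. The following is a minimal sketch of that read-out, not the RELION implementation. The function name is hypothetical, the crossing is found by simple linear interpolation between shells, and the curve is synthetic toy data rather than data from this study.

```python
def resolution_at_fsc_threshold(spatial_freqs, fsc_values, threshold=0.143):
    """Resolution (1 / spatial frequency) where the FSC first drops below threshold.

    spatial_freqs: increasing spatial frequencies in 1/angstrom.
    fsc_values: FSC value for each resolution shell.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc_values)):
        prev_fsc, curr_fsc = fsc_values[i - 1], fsc_values[i]
        if prev_fsc >= threshold > curr_fsc:
            # Linear interpolation between the two bracketing shells.
            fraction = (prev_fsc - threshold) / (prev_fsc - curr_fsc)
            step = spatial_freqs[i] - spatial_freqs[i - 1]
            crossing_freq = spatial_freqs[i - 1] + fraction * step
            return 1.0 / crossing_freq
    return None


# Synthetic toy curve (assumption for illustration, NOT data from this study).
toy_freqs = [0.10, 0.20, 0.30, 0.40]      # 1/angstrom
toy_fsc = [1.000, 0.800, 0.243, 0.043]

toy_resolution = resolution_at_fsc_threshold(toy_freqs, toy_fsc)
print(f"toy resolution at FSC = 0.143: {toy_resolution:.2f} A")
```

Real curves are noisier than this toy example, so the first crossing is usually taken on the masked, phase-randomization-corrected curve reported by the processing software.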

5-HT3A receptor model building. The maps for state 1 and state 2 contained 
density for the entire ECD, TMD and a large region of the ICD. The final refined 
models comprised residues Thr8–Ile332 and Leu397–Ser461. The missing 
region (333–396) lies in the unstructured loop that links the amphipathic MX 
helix and the MA helix. The apo-5-HT3A receptor cryo-EM structure (PDB ID: 
6BE1) was used as an initial model and aligned to the 5-HT3A receptor cryo-EM 
map calculated with RELION 2.1. The cryo-EM map was converted to the mtz 
format using the mapmask and sfall tools in CCP4i software. The mtz map was then 
used for manual model building in COOT. After initial model building, the 
state 1 and state 2 models were refined against their respective EM-derived maps 
using the phenix.real_space_refine tool from the PHENIX software package, 
using rigid body, local grid, NCS, and gradient minimization. The models 
were then subjected to additional rounds of manual model fitting and refinement, 
resulting in good final model-to-map cross-correlation (Extended Data Table 1). 
Stereochemical properties of the models were evaluated by MolProbity. 

Protein surface area and interfaces were analysed using the PDBePISA server 
(http://www.ebi.ac.uk/pdbe/pisa/). To compare the apo, state 1 and state 2 
structures, all ligands, ions and water molecules were removed from the PDB files. 
Additional residues in the apo-5-HT3A receptor structure were also removed before 
analysis so that surface area comparisons were made between identical construct 
lengths. Electrostatic surface potential calculations were carried out using the 
APBS tools plug-in in PyMOL. The pore profile was calculated using the HOLE 
program. All the tunnels were calculated using the Caver 3.0 PyMOL plug-in with 
a minimal tunnel radius of 2.8 Å. Figures were prepared using PyMOL v.2.0.4 
(Schrödinger, LLC). 
Molecular dynamics simulations. Each simulation cell (of approximate 
dimensions 13.5 × 13.5 × 19.5 nm³) contained the full-length receptor structure embedded 
in a phospholipid (POPC, 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine) 
bilayer, with an aqueous solution of Na⁺ and Cl⁻ ions on either side. The 
protein-bilayer systems were assembled and equilibrated following a previously established 
protocol. The TIP4P water model was used. Simulations were performed with 
GROMACS version 5.1, using the OPLS all-atom protein force field with 
united-atom lipids, and at an integration time-step of 2 fs. A Verlet cut-off scheme was 
applied, and long-range electrostatic interactions were calculated using the Particle 
Mesh Ewald method. The temperature and pressure were maintained at 37 °C 
and 1 bar, respectively, using the velocity-rescale thermostat in combination with 
a semi-isotropic Parrinello–Rahman barostat, with coupling constants of 
τT = 0.1 ps and τP = 1 ps. Bonds were constrained through the LINCS algorithm, 
and an additional harmonic restraint at a force constant of 1,000 kJ mol⁻¹ nm⁻² 
was placed on the protein backbone atoms to preserve the original conformational 
state of the cryo-EM structure. 

For water free-energy estimation, three 50 ns simulation replicates were each 
initiated from an independently assembled receptor-membrane system, containing 
NaCl at an approximate concentration of 0.15 M. Using the Channel Annotation 
Package (www.channotation.org), the equilibrium density of water molecules at 
successive positions along the central pore axis was measured, and free-energy 
profiles were derived through an inverse Boltzmann calculation-based method. 
For monitoring ion permeation events, a separate set of simulations, each of 100 
ns duration and with 0.7 M NaCl included in the simulation cell, was conducted 
in the presence of a 0.2 V transmembrane potential difference. This was applied 
by imposing an external, uniform electric field across the simulation cell along the 
membrane normal direction. The field strength was of magnitude 0.05 V nm⁻¹, 
with the cytoplasmic side having either negative or positive potential in different 
simulation runs. 
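
The inverse Boltzmann step mentioned above converts an equilibrium density profile into a free-energy profile through G(z) = −RT ln(ρ(z)/ρ_bulk). The sketch below illustrates that relation only. It is not the Channel Annotation Package code; the function name, the choice of bulk density as the reference and the toy density values are assumptions made for illustration. The default temperature of 310.15 K corresponds to the 37 °C quoted for the simulations.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ mol^-1 K^-1


def free_energy_profile(densities, bulk_density, temperature_k=310.15):
    """Inverse Boltzmann estimate G(z) = -RT ln(rho(z) / rho_bulk) in kJ/mol.

    Positions with zero density were never sampled, so they are reported
    as infinite rather than as a finite barrier.
    """
    rt = GAS_CONSTANT * temperature_k
    profile = []
    for rho in densities:
        if rho <= 0.0:
            profile.append(math.inf)
        else:
            profile.append(-rt * math.log(rho / bulk_density))
    return profile


# Toy water densities along the pore axis (arbitrary units, NOT study data).
toy_bulk = 33.0
toy_densities = [33.0, 16.5, 3.3, 0.0, 33.0]

toy_profile = free_energy_profile(toy_densities, toy_bulk)
for rho, g in zip(toy_densities, toy_profile):
    print(f"density {rho:5.1f} -> free energy {g:.2f} kJ/mol")
```

In practice the density is a smoothed histogram of finite length, so poorly sampled regions such as a de-wetted gate give only a lower bound on the barrier height.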

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 


Data availability 


Coordinates of the 5-HT3A receptor structures have been deposited at the PDB 
under accession codes 6DG7 (state 1) and 6DG8 (state 2). The cryo-EM maps 
have been deposited in the Electron Microscopy Data Bank under accession codes 
EMD-7882 (state 1) and EMD-7883 (state 2). All relevant data are available from 
the authors. 
Extended Data Fig. 1 | Data processing workflow. a, A representative 
micrograph of 5-HT3A receptor incubated with 100 µM serotonin in 
vitreous ice (top). Selected 2D classes showing various orientations 
(bottom). b, A schematic of the steps followed in data processing leading 
to 3.32 Å and 3.89 Å reconstructions of state 1 and state 2, respectively. 
Extended Data Fig. 2 | Estimation of resolution and validation of the 
models. a, FSC curves before (red) and after (blue) the application of a 
soft mask in RELION for state 1 (left) and state 2 (right). The dashed line 
represents FSC of 0.143. b, For cross validation, FSC curves of the refined 
model versus half map 1 (used during refinement, cyan), and refined model 
versus half map 2 (not used during refinement, purple) were calculated for 
state 1 (left) and state 2 (right). c, Local resolutions of the state 1 (left) and 
state 2 (right) reconstructions were estimated using the ResMap program. 
Extended Data Fig. 3 | Map correlation of state 1 and state 2. Various 
regions of the model (shown as a cartoon) and corresponding density 
map (mesh) around the residues are shown to validate the final model. 
Residues are depicted as sticks. The depicted regions in state 1 and the 
corresponding contour levels: Cys loop (7.0σ), loop C (8.0σ), loop F 
(8.0σ), M2 (6.0σ), M2–M3 linker (7.0σ), M4 (6.5σ), MX helix (7.5σ) and 
MA helix (7.5σ). The depicted regions in state 2 and the corresponding 
contour levels: Cys loop (8.5σ), loop C (8.0σ), loop F (8.0σ), M2 (6.0σ), 
M2–M3 linker (7.0σ), M4 (7.0σ), MX helix (7.0σ) and MA helix (6.0σ). 
Extended Data Fig. 4 | Serotonin-induced conformational changes in 
the ECD and TMD. a, A global alignment of the apo structure with state 
1 (left) and state 2 (right). The top panel shows the ECD and the bottom 
panel shows the TMD, both viewed from the extracellular end. The arrows 
indicate the direction of rotation with respect to the apo state. State 1 and 
state 2 structures superimpose with the apo ECD with a root mean square 
deviation (r.m.s.d.) of 1.16 for state 1 and 1.41 for state 2 (residues 8–220). 
State 1, and particularly state 2, diverge markedly in the TMD and ICD 
(r.m.s.d. of 1.1 for state 1 and 4.24 for state 2 for residues 221–462). b, A 
side view of the ECDs upon aligning state 1 and state 2 to the apo state. 
c, A top view of the TMDs when aligned with respect to the ECD. The 
arrows show relative displacements in different regions of state 1 and state 
2 with respect to the apo structure. 
Extended Data Fig. 5 | Inter-subunit interaction at the ECD–TMD–ICD 
interface. a, Inter-subunit interactions at the ECD–TMD interface in the 
apo state, state 1 and state 2. b, Inter-subunit interactions at the TMD–ICD 
interface in the three states. The potential interactions were predicted as 
polar contacts in PyMOL. Interacting residues are shown as sticks. Residue 
labels are colour-coded based on their location. The apo state has the 
largest buried surface area (31,610 Å²), which progressively decreases in 
state 1 (30,960 Å²) and then state 2 (25,340 Å²). 
Colour-key labels in the figure: β1–β2 loop, β4–β5 loop (loop A), β6–β7 loop 
(Cys loop), β8–β9 loop (loop F), pre-M1, M2–M3 linker, M1–M2 loop/M2, 
post-M3 loop and MA. 
Extended Data Fig. 6 | The intracellular domain of state 2. a, A detailed 
view of the ICD with key residues shown in stick representation. Only two 
adjacent subunits are shown for clarity. The solvent-accessible vestibule 
in the ICD calculated using Caver 3.0 with a minimum cavity radius of 
2.8 Å is shown as dark-cyan spheres. The positively charged residues lining 
the portal are shown as blue sticks. The negatively charged residues in 
the vicinity are shown in red-brown. Residues that form the hydrophobic 
patch at the N-terminal end of the MA helix are shown as green sticks. 
Residues His309 (post-M3 loop) and Glu250 (M2) are in a potential 
interaction and are shown in magenta. b, A zoomed view of the ICD to 
highlight the break in the MA–M4 helices (highlighted in magenta) at Gly430. 
Glycine-mediated transmembrane-helix distortion at the i − 3 position 
is well studied, and Gly430 may have a dynamic role at the hinge point 
between the MA and M4 helices. A similar bend in the MA–M4 helix was 
previously observed in the Torpedo marmorata nAChR structure even in 
the absence of glycine at the equivalent position. 
Extended Data Fig. 7 | Molecular dynamics simulations of state 1. 
a, Trajectories of water and Na⁺ ion coordinates within 5 Å of the channel 
axis inside the pore over 100 ns with a 0.2-V transmembrane potential 
difference, with the cytoplasmic side having a positive potential. Stretches 
of white regions indicate areas devoid of water or ions. b, Time-averaged 
radii along the central pore axis of the state 1 structure during two 10-ns 
fractions of the simulation (within the boxed region of a) (top). The 
orange arrows indicate positions of Leu260. The dashed line indicates the 
approximate radius of a hydrated Na⁺ ion. Free-energy profile of a water 
molecule along the central pore axis during the 10-ns window (bottom). 
The barrier at the Leu260 position disappears in the 45–55-ns time 
frame. c, Snapshots of pore conformation around Leu260 (shown in stick 
representation) during the corresponding time window. The widening of 
the pore radii and the disappearance of the barrier for water permeation 
are associated with the rotameric reorientation of the Leu260 side chain 
(indicated by the arrow). d, An overlay of the pore radii from the two time 
windows. Changes at position Leu260 are marked by the arrow. 
e, Snapshot of pore conformation around Leu260 as a Na⁺ ion (purple) is 
passing through, with two Leu260 side chains rotated away (indicated 
by *). Three independent simulations were run. 
Extended Data Fig. 8 | Snapshots of the state 1 pore conformation 
from the molecular dynamics simulation. a, Side-chain orientations of 
Leu260 and Glu250 at different points during the simulation (indicated 
in Extended Data Fig. 7). b, The corresponding pore radius profiles. The 
positions of Glu250 and Leu260 are highlighted. 
Extended Data Fig. 9 | Comparison of pLGIC pore profiles. Pore 
profiles calculated using the HOLE program for the M2 region of 
nAChR (PDB ID: 5KXI), GABAA receptor β3 homopentamer (PDB ID: 
4COF), glutamate-gated chloride channel (GluCl) (apo structure, PDB 
ID: 4TNV; ivermectin-bound, PDB ID: 3RHW), glycine receptor 
(GlyR) (strychnine-bound, PDB ID: 3JAD; glycine-bound, PDB ID: 3JAE; 
glycine- and ivermectin-bound, PDB ID: 3JAF) and 5-HT3A receptor 
(apo structure, PDB ID: 6BE1; state 1, PDB ID: 6DG7; state 2, PDB ID: 
6DG8). Only two M2 helices are shown as ribbon, for clarity. Pore-facing 
residues are shown in stick representation. Green and magenta spheres 
define radii of 1.8–3.3 Å and >3.3 Å, respectively. 
Panel labels recoverable from the figure: nicotine (desensitized), benzamidine 
(desensitized), strychnine (closed), glycine (open), glycine/ivermectin 
(desensitized), apo (closed), serotonin state 1 (pre-open/desensitized?) and 
serotonin state 2 (open). 
Extended Data Table 1 | Sequence of mouse 5-HT3A receptor used in the cryo-EM study and the data on cryo-EM and refinement 


a 


MWSHPQFEKGGGSGGGSGGGSWSHPOFEKGGGSGGGSGGGSWSHPOFEKGGGSGGGSGGGSWSHPOQFEKENLYF 
BI B2 


GBA TOARDTTQPALLRLSDHLLANYKKGVRPVRDWRKPTIVS i DVIMYATI SE er CR DEF 72 
B3 5 6 


B3 B4 = ===} Cys loop 


LOWTPEDEDNVTKLS IPTDSIWVPDILINEFVDVGKSPNIPYVYVHHRGEVONYKPLOLVTACSLDIYNFPFDV 146 
B7 p8 p9 B10 


Loop F Loop C 


ONES GEES WEED ERE TLRS PEEVRSDKSIFINQGEWELLEVFPQFKEFSIDISNSYAEMKFYVIIRRRP 220 
MI M2 M3 


LFYAVSLLLPSIFLMVVDIVGFCLPP DSGERVSERKITLLLGYSVELI IVSDTLPATAIGTPLIGVYFVVCMALL 294 
MX ~ 


VISLAETIFIVRLVHKQDLORPVPDWLRHLVLDRIAWI LCLGEQPMAHRPPATFQANKTDDCSGSDLLPAMGNH 368 
MA M4 


CSHVGGPQDLEKTPRGRGS PLPPPREASLAVRGLLOELSS IRHFLEKRDEMREVARDWLRVGYVLDRLLFRIYL 442 
—_______— 
SSS 5- 


LAVLAYSITLVTLWS IWHYSENLYFOGTETSQVAPA 


b                                      State 1                  State 2 
                                       (EMDB-7882)              (EMDB-7883) 
                                       (PDB 6DG7)               (PDB 6DG8) 
Data collection and processing 
  Magnification                        130,000× 
  Voltage (kV)                         300 
  Electron exposure (e⁻/Å²)            40 
  Defocus range (µm)                   −1.0 to −2.5 
  Pixel size (Å)                       0.532 
  Symmetry imposed                     C5 
  Initial particle images (no.)        749,970 
  Final particle images (no.)          103,698                  18,839 
  Map resolution (Å)                   3.32                     3.89 
    FSC threshold                      0.143                    0.143 
Refinement 
  Initial model used (PDB code)        6BE1                     6BE1 
  Model resolution (Å)                 4.31                     4.31 
    FSC threshold                      0.143                    0.143 
  Map sharpening B factor (Å²)         −50                      −50 
  Model composition 
    Non-hydrogen atoms                 16,720                   16,715 
    Protein residues                   16,175                   16,175 
    Ligands                            545                      540 
  B factors (Å²) 
    Protein                            154.41                   244.56 
    Ligand                             133.45                   169.20 
  R.m.s. deviations 
    Bond lengths (Å)                   0.004                    0.004 
    Bond angles (°)                    1.032                    1.031 
Validation 
  MolProbity score                     1.53 (94th percentile)   1.56 (94th percentile) 
  Clashscore                           3.40 (97th percentile)   4.89 (94th percentile) 
  Poor rotamers (%)                    1.27                     0.55 
  Ramachandran plot 
    Favored (%)                        95.61                    95.61 
    Allowed (%)                        4.39                     4.39 
    Disallowed (%)                     0.00                     0.00 


a, The full-length mouse 5-HT3A receptor sequence used in the cryo-EM study. Regions in the sequence highlighted in green, blue, grey and yellow represent strep-tag, linker, TEV cleavage site and 
1D4-tag, respectively. Secondary structural elements as seen in state 1 are plotted above the sequence. Loops in grey are not seen in the final refined structure. All the important loops, sheets and 
helices are labelled. Glycosylation sites are marked as blue arrows. Key residues within the serotonin-binding sites are highlighted in brown. Cysteines present in the Cys loop are shown in cyan. 
Pore-facing residues in M2 are shown in green. Arg416 in the ICD is shown in red. b, Cryo-EM data collection, refinement and validation statistics. 
The serotonin 5-HT3 receptor is a pentameric ligand-gated 
ion channel (pLGIC). It belongs to a large family of receptors 
that function as allosteric signal transducers across the plasma 
membrane; upon binding of neurotransmitter molecules to 
extracellular sites, the receptors undergo complex conformational 
transitions that result in transient opening of a pore permeable 
to ions. 5-HT3 receptors are therapeutic targets for emesis and 
nausea, irritable bowel syndrome and depression. In spite of 
several reported pLGIC structures, no clear unifying view has 
emerged on the conformational transitions involved in channel 
gating. Here we report four cryo-electron microscopy structures 
of the full-length mouse 5-HT3 receptor in complex with the 
anti-emetic drug tropisetron, with serotonin, and with serotonin 
and a positive allosteric modulator, at resolutions ranging from 
3.2 Å to 4.5 Å. The tropisetron-bound structure resembles those 
obtained with an inhibitory nanobody or without ligand. The 
other structures include an 'open' state and two ligand-bound 
states. We present computational insights into the dynamics of 
the structures, their pore hydration and free-energy profiles, and 
characterize movements at the gate level and cation accessibility 
in the pore. Together, these data deepen our understanding of 
the gating mechanism of pLGICs and capture ligand binding in 
unprecedented detail. 

A decade after the structure of the Torpedo marmorata nicotinic 
acetylcholine receptor (nAChR), the set of known pLGIC structures 
is rapidly expanding and reflects the diversity of this protein family. 
The structures share a conserved architecture, in which subunits are 
arranged around a central five-fold pseudo-symmetry axis. Together 
they have clarified details of ligand binding, selectivity and allosteric 
modulation. They have also revealed a complex landscape of 
conformations, raising questions of how to relate structures to the wealth 
of data that established the existence of multiple agonist-bound 
pre-active intermediate states, of distinct open states and of multiple 
desensitized states. 

Mouse homomeric 5-HT3A receptors, with their entire intracellular 
domain (ICD), were solubilized with the detergent C12E9 and 
purified. We first performed cryo-electron microscopy (cryo-EM) in the 
presence of the potent antagonist tropisetron, and obtained a 4.5 Å 
structure (Fig. 1b, Extended Data Figs. 1, 2, Extended Data Table 1), 
hereafter referred to as T. T is globally similar to the structure 
previously solved by X-ray crystallography (root mean square deviation 
(r.m.s.d.) of 0.6 Å), the pore of which was shown by molecular 
dynamics to be occluded. Tropisetron fits in a peanut-shaped density present 
in the neurotransmitter pocket (Extended Data Fig. 3d–f). The ICD 
contains a region of about 60 residues, which is averaged out (also in the 
other reconstructions, see below) because of its intrinsic flexibility. 
T resembles the 4.5 Å cryo-EM structure of the apo 5-HT3 receptor 
(r.m.s.d. of 1.15 Å), with small differences in the lipid-exposed helices 
M3, MX and M4. 


We then sought to identify agonist-elicited conformations of the 
5-HT3 receptor, and performed cryo-EM imaging in the presence of 
serotonin. A first reconstruction presented heterogeneity in the 
membrane domain. Further focused 3D classification allowed two subsets 
of particles to be separated, which yielded reconstructions at 4.2 Å and 
4.1 Å resolution, corresponding to two different conformations (Fig. 1b, 
Extended Data Fig. 4). The maps offer a variable level of information: 
most side chains in the extracellular domain are resolved, whereas some 
parts of the transmembrane domain (TMD) do not have side-chain 
information and some have limited information in the main chain 
position (Extended Data Figs. 4c, 5), reflecting the receptor dynamics. 
In the two refined structures, the extracellular domains (ECDs) have 
undergone an equivalent transition from the T state and serotonin 
could be modelled in the neurotransmitter site, whereas the TMDs 
differed markedly (Fig. 2, Extended Data Fig. 6). We call the first 
structure I1 for intermediate 1 and the second structure F for full, on the 
basis of the extent of movements compared to the inhibited state. I1 
exhibits only limited displacements in the upper part of M1 and M2, 
and a rearrangement of the M2–M3 loop (Supplementary Video 1). 
F is characterized by a pronounced reorganization of the 
transmembrane helices, which can be described by a rigid-body movement of the 
four-helix bundle coupled to a rearrangement of M4 (and of M3 to a 
lesser extent) sliding along M1 and M2 (Supplementary Videos 2, 3). 
F also features a very dynamic ICD, beyond the intrinsically 
disordered region, in which model building was not possible even though 
the data showed incomplete densities for MX and M4 (Extended Data 
Fig. 4a–c). 

Finally, we collected a dataset in the presence of serotonin and 
trans-3-(4-methoxyphenyl)-N-(pentan-3-yl)acrylamide (TMPPAA, a 
compound exhibiting agonist and positive allosteric modulator activity on 
the human receptor), a combination that yields weakly desensitizing 
currents (Extended Data Fig. 1f). From this dataset, we reconstructed 
a 3.2 Å resolution structure (Extended Data Fig. 2d–g), which provides 
unambiguous side-chain information for nearly the entire receptor. 
The refined structure has an ECD conformation essentially equivalent 
to that of I1 and F. The membrane domain is similar to that of I1, albeit 
with a slightly more expanded top section and pore (Fig. 2). We call this 
structure I2 for intermediate 2. 

Serotonin can be unambiguously positioned in the 
neurotransmitter site of I2. It fits tightly within its binding pocket (Extended Data 
Fig. 3a–c) in an orientation consistent with functional and binding 
studies. Surrounded by obligatory aromatic residues (F199 and Y207 
on the principal side, Y126 and W63 on the complementary side), it 
is positioned to form a cation–π interaction with W156 and hydrogen 
bonds with the main chain of S155 and Y64. The C loop is positioned 
moderately inward relative to the inhibited conformations, its 
position locked by a salt bridge between D202 and R65. A hallmark of 
allosteric activation is the subunit–subunit rearrangement (Extended 
Data Fig. 3d), which affects the site volume and geometry. 
Associé CNRS and University of Illinois at Urbana-Champaign, Vandoeuvre-les-Nancy, France. Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL, USA. e-mail: 
Fig. 1 | Three-dimensional reconstructions and structures of 
homomeric 5-HT3A receptor. a, b, Reconstructions (a) and structures (b) 
for: the tropisetron dataset (protein in blue and ligand in red), the 
serotonin and Ca²⁺ dataset (I1 in yellow, F in purple and ligand in 
green), and the serotonin and TMPPAA dataset (I2 and ligand in green). 
Resolutions are shown according to the Fourier shell correlation 0.143 
criterion. 

TMPPAA has previously been proposed to bind to an allosteric site 
in the TMD, but there is no clearly interpretable density in our data 
to model the compound. We tested TMPPAA agonist activity on a set 
of around 45 single-point mutants of the human receptor, which 
collectively reveal that the drug binds between M4, M1 and M3 into an 
intra-subunit cavity skirted by lipids of the outer leaflet in the upper 
part of the TMD (Extended Data Fig. 7, Supplementary Table 1), 
where endogenous steroids bind to nAChRs. More generally, several 
allosteric druggable sites have been identified in pLGICs, which bind 
diverse compounds including general anaesthetics such as propofol 
and flavourings such as citral or eucalyptol. Allosteric sites in the 
5-HT3 receptor change in both shape and volume during transitions 
(Extended Data Figs. 3i, 7a). 

At the ECD–TMD interface, a set of conserved residues that are 
essential for gating form a structural motif that is common to all 
pLGIC structures, the location of which may correlate with the state 
of the channel. This motif consists of charged residues (E53, D145, 
E186 and R218) sandwiched between conserved aromatic residues: 
W187 at the top and the 142-FPF-144 motif of the Cys loop, plus Y223 
at the bottom (Extended Data Fig. 8). The FPF motif itself penetrates 
the transmembrane domain like a wedge; its position differs in 
each conformation. When superimposing structures on a TMD 
subunit, the wedge lies close to M2 in the tropisetron-bound structure and 
moves towards M4–M1 in the I1, I2 and F structures (which is possible 
because the conserved P230 allows the upper M1 to kink or straighten). 
A marked downward concerted movement of the wedge and of the 
β8–β9 loop (containing E186 and W187) occurs in the F structure, 
pushing on the M2–M3 loop, and may be responsible for the marked 
reorganization of the TMD observed in that state. 

pLGIC pores are lined with side chains of residues from the five 
M2 helices. In direct agreement with results from substituted cysteine 
accessibility mutagenesis (SCAM), our structures show that in all 
conformations, positions −1′, 2′, 6′, 9′, 13′, 16′, 17′ and 20′ of the M2 
α-helix are exposed to the pore lumen (Fig. 3a). Positions 12′ and 
15′, which are also accessible in SCAM, are partly exposed to solvent 
on the rear of the M2 helices. Superimposition of a single M2 helix 
underscores its flexibility at both ends; superimposition of the five M2 
helices highlights the crucial role of movements of the hydrophobic 
side chain at the 9′ position (Fig. 3b, c). Minimum pore diameter is 
often a key element in the assignment of experimental structures to 
physiological states. Pore profiles are compatible with closed 
hydrophobic gates (rings at positions 9′, 13′, 16′ and 17′) in the T and I1 
structures, with an open channel in the F structure, whereas the I2 
structure presents an intermediate profile. However, pore profiles are 
influenced by the resolution, symmetry and rotameric state of the 
side chains pointing into the pore lumen. Some positions, such as 
the key negative charge in −1′, are often poorly resolved in density 
maps, and are known to adopt alternate conformations. Moreover, 
pore profiles are not informative about hydrophobicity, wetting or 
dynamics, which have key roles in permeability. We performed 
molecular dynamics simulations to better characterize permeation. In 
the microsecond-long trajectories starting from the inhibited X-ray 
structure or from I1, no water or ions cross the pore, and the 
hydrophobic 9′, 13′ and 16′ rings establish a de-wetted hydrophobic gate 
that is tighter than in the starting structures (Fig. 3d, Extended Data 
Fig. 9). By sharp contrast, a simulation starting from F features an 
open pore that is accessible to ions and water throughout the 
trajectory (Supplementary Video 4). During the initial part of the 
simulations starting from I2, when the Cα atoms are positionally restrained, 
wetting and de-wetting events of the pore occur as if its conformation 
were on the open–closed verge (Extended Data Fig. 9). Wetting is 
linked to the presence of transiently hydrated grooves at the back 
of M2 helices, down to the polar residues Y11′–S12′, affecting the 
electrostatic landscape inside the pore. Wetting also correlates with 
rotation of L9′ out of the pore lumen. Once the geometric restraints 
are removed, the I2 pore relaxes to a closed conformation similar to 
that observed in the I1 trajectory. The absence of TMPPAA in the 
simulation may rationalize the closure. Potentials of mean force for 
the translocation of a K⁺ ion reveal an insurmountable 12 kcal mol⁻¹ 
barrier in the case of I1 (representative of closed hydrophobic gate 
conformations), and an essentially flat free-energy landscape in the 
case of F (Fig. 3e). 
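
To give a feel for why a 12 kcal mol⁻¹ barrier is described as insurmountable, the Boltzmann factor exp(−ΔG/RT) can be evaluated for the barrier heights quoted above. This is a back-of-the-envelope sketch rather than part of the reported analysis. The temperature of 300 K is an assumption, not a value taken from the text, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal mol^-1 K^-1


def boltzmann_factor(barrier_kcal_per_mol, temperature_k):
    """Relative probability exp(-dG / RT) of reaching the top of a barrier."""
    return math.exp(-barrier_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# Assumption: 300 K is used for illustration only.
ASSUMED_TEMPERATURE_K = 300.0

factor_i1 = boltzmann_factor(12.0, ASSUMED_TEMPERATURE_K)   # barrier reported for I1
factor_flat = boltzmann_factor(0.0, ASSUMED_TEMPERATURE_K)  # idealized flat landscape

print(f"I1-like barrier: {factor_i1:.1e}")
print(f"flat landscape:  {factor_flat:.1f}")
```

With these assumptions the barrier-top population for the I1-like case is on the order of 10⁻⁹ relative to a flat landscape, which is consistent with no ion crossings being seen on the simulated timescale.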

We asked whether the structures could be assigned to physiological 
states. T and F are straightforward to assign, whereas I1 and I2 are less 
so. T typifies an inhibited state, with resting-like ECD stabilized by 
tropisetron, and a closed pore, resembling the apo state. F represents 
an open state, with an activated ECD with bound serotonin, an 
activated TMD and a wide open pore. Two assignments are possible for 
the closely related I1 and I2 conformations (Fig. 4a). In a first scheme, 
I1 exemplifies a serotonin-bound, pre-active closed state, in which the 
ECD and the ECD–TMD interface, but not the TMD, have 
undergone a transition. This is consistent with single-channel analysis of 
the 5-HT3 receptor, which yields kinetic models in which opening can 
occur from a ligand-bound pre-active state. In a second scheme, I1 
represents a closed desensitized state that occurs downstream from 
the open state. In both schemes, I2 is best described as in a state close 
to I1 (that is, close to either pre-active or desensitized) wherein the 
slightly wider pore promotes wetting, which could enable ion passage 
or merely favour the switch to a fully open state, consistent with the 
TMPPAA-induced modulation. 

A distinctive feature of the second scheme is that the activation gate, 
consisting of rings of hydrophobic residues in the upper pore, would 
open in the active state and close in the desensitized state. This implies 
a marked movement of the upper pore during desensitization, and no 
ion access from the extracellular compartment to the lower pore in 
the desensitized state. Functional experiments on anionic receptors 
indicate that they have distinct activation and desensitization gates, 
the latter being located at the cytoplasmic end of the pore. Moreover, 
structures of the GABAA receptor β3 homopentamer and of the α4β2 
nAChR, with pores constricted at positions −2′ and −1′, have been 
assigned to desensitized states. These receptors, however, were 
engineered close to the constriction, a feature that can alter desensitization 
in 5-HT3 receptors. 

To challenge the two schemes, we measured the movement of the 
gate region and the accessibility of the pore to organic cations. First, we 
Fig. 2 | Transitions between the tropisetron-bound, the serotonin-bound, 
and the serotonin and TMPPAA-bound states. a, View parallel to 
the membrane of one subunit of the serotonin-bound conformation I1 
(yellow) overlaid with the tropisetron-bound conformation (blue, TMD 
superposition, left) or with the serotonin-bound conformation F 
(purple, ECD superposition, right). Inset, vectors indicate the local 
amplitude of movements, sampled on Cα atoms (T to I1 in blue, I1 to F 
in purple). b, Overlay of the I1 (yellow) and the serotonin and 
TMPPAA-bound I2 (green) conformations (left); vector representations of the 
I1 to I2 transition (ECD superposition). c, Pairwise overlays of the 
TMD illustrating transitions at the quaternary level. Structures were 
superimposed on the ECD pentamer. L9′ (L260) residues are shown as 
spheres. The line and arrow on the middle overlay indicate the region 
depicted in d and the orientation of the view, respectively. d, Tertiary 
reorganization within a subunit TMD. Overlay in ribbon representation 
of I2 and F with L260 Cα as spheres (TMD superposition within a single 
subunit). Note the vertical shift of M3–M4 relative to M1–M2, distortions 
on the extracellular halves of M1–M2, and interface re-arrangement with 
the neighbouring subunit helices (on the right of the dotted line). 

used voltage-clamp fluorometry (VCF) to probe the local 
conformational changes in the upper pore at the 19′ position (Extended Data 
Fig. 10). We labelled S19′C mutants, expressed at the surface of Xenopus 
laevis oocytes, with 5-carboxytetramethylrhodamine 
methanethiosulfonate (MTS-TAMRA). Transient stimulation with serotonin elicited 
simultaneous changes in current and fluorescence with parallel rise 
Fig. 3 | Geometry, wetting and energy landscape of the transmembrane
pore. a, Static pore geometry; the accessible pathway through the pore
is represented as a solid surface for each structure. Diameters (Ø) of the
constriction zones are noted. One M2 helix is depicted as cartoon with
pore-exposed side chains as spheres (polar in green, hydrophobic in
yellow, charged in blue). The view is parallel to the plane of the membrane.
b, Zoom on the hydrophobic gate constriction, formed at the level of L260
(sticks), which has a small movement backward in the I2 state and rotates
outwards in the F state. Coloured lines indicate molecular surfaces. The
view is perpendicular to the membrane plane. c, Superposition of one
M2 helix in the four conformations. d, Water densities of the pore region
during the unrestrained part of simulations. Densities are depicted as
transparent surfaces at the same contouring level. The density for K+ ions
is also included for the F trajectory featuring an open pore. e, Potentials of
mean force of K+ ions as a function of the position along the pore axis.
Fig. 4 | Putative molecular mechanisms of operation. a, T (blue)
represents an inhibited state, stabilized by tropisetron (red circle).
I1 (yellow) represents either a closed pre-active serotonin-bound (green
circle) state or a desensitized state. F (purple) represents an open active
state. The grey line illustrates the electrical response to serotonin recorded
in an oocyte expressing 5-HT3A receptors. b, Recordings of MTSEA
(1 mM) modification on serotonin-evoked current in the S2’C mutant.


and decay, whereas prolonged (7-min) exposure to serotonin resulted 
in similar signals at the onset, followed by a slow decay for the cur- 
rent signal without a change in fluorescence. Both signals returned to 
baseline when serotonin was removed. The VCF results argue against 
the second scheme, because the probe environment changes upon acti- 
vation but not during desensitization. Second, we performed SCAM 
in the resting (absence of ligand), open (during transient serotonin 
application) and desensitized (after prolonged serotonin application) 
states (Fig. 4b). MTSEA (2-aminoethyl methanethiosulfonate) and 
MTSET (2-2-(trimethylammonium)ethyl methanethiosulfonate) are 
organic cations that react with free cysteines through their methan- 
ethiosulfonate moiety. The aminoethyl head group of MTSEA is small, 
which enables it to access narrow spaces. MTSEA is, however, also 
known to cross membranes in its uncharged form. Application of 1 mM 
extracellular MTSEA in the resting state yielded no modification of 
currents in T6’C and S2’C mutants but inhibited currents (by 32%) in
E−1’C mutants (Fig. 4c). In the resting state, the compound can probably
access position −1’ from the internal compartment, but it is unable
to access the 2’ or 6’ positions. Similar applications in the open state
produced irreversible inhibitions at 6’, 2’ and −1’ positions (82%, 69%
and 58%, respectively). MTSEA can therefore reach 6’ and 2’ from the
extracellular compartment in the open state, consistent with previous
studies! and with the opening of the activation gate in F. Applications of
extracellular MTSEA in the desensitized state also produced irreversible
inhibitions at 6’, 2’ and −1’ positions (30%, 63% and 50%, respectively).
From these results, we infer that positions 6’ and 2’, located
below the hydrophobic gate seen in T and I1, are accessible from the
extracellular side in the desensitized state. No conclusions can be drawn
for position −1’, as its labelling in the resting state precludes further
interpretation. The results suggest that the activation gate is open in
the desensitized conformation(s). MTSET possesses a bulkier trimethylammonium
headgroup and cannot cross membranes. We observed
that MTSET labels cysteine mutants at 6’, 2’ and −1’ positions when
applied in the open state (71%, 75% and 46%, respectively). Application
of MTSET in the desensitized state yielded a small variable inhibition
at 2’ (12 ± 10%) and none at 6’, indicating that these positions are
more accessible for MTSEA than for the bulkier MTSET. On the basis
of these results, we favour the scheme in which I1 is assigned to a pre-active
state. Nevertheless, desensitization is a complex process involving
MTSEA is applied in the resting (blue), the desensitized (orange), or active
(purple) state. The protocol includes 10-s test serotonin applications
before and after 2-min MTSEA modification. c, Changes in current after
MTSEA (1 mM) or MTSET (1 mM) modification, for −1’C, S2’C and
T6’C mutants, in the resting (blue), active (purple) or desensitized (yellow)
state.


several distinct states!°, and we cannot rule out the possibility that I1 
represents a desensitized state. 

The challenge of matching structures to states without ambiguity 
transcends the present study and pertains to the whole field of pLGIC 
structures”?””®. This challenge arises from diverse factors: limited 
resolution, putative influence of detergent, crystal packing and recep- 
tor engineering, and the possibility that ensembles of multiple related 
conformations are necessary to properly depict a physiological state. 
Bearing in mind the ambiguities on state assignment, we compared 
the 5-HT3 transitions to those observed for the Gloeobacter violaceus
receptor (GLIC)*, the worm glutamate-gated receptor (GluCl)” and
the glycine α1 receptor (GlyR)’. Common agonist-induced features
emerge, such as a global twist, quaternary reorganization of the ECD, 
rearrangement of the interface between domains involving the con- 
served sandwich motif, and local movements of the upper TMD. 
Differences also appear: the extent of TMD reorganization seen in F, 
with M4 sliding on other helices, is not observed in the other receptors;
the ECD reorganization is well-described as ‘un-blooming’™ for
GLIC and GluCl, but not for the 5-HT3 and glycine receptors. The 
open pore of F is wider than that of GLIC and narrower than that 
of GlyR. Bacterial, animal anionic and animal cationic channels may 
have evolved distinct sets of conformations for a given physiological 
state, as they belong to separate branches of the pLGIC family. Our 
5-HT3 receptor structures highlight several transitions in the cationic
branch. They also contribute to knowledge on other important aspects 
of pLGIC research that are only alluded to in this report, such as the 
role of M4 in gating, the pharmacology of allosteric sites and ICD 
dynamics. Further work with better resolution, structures of mutant 
receptors and structures of receptors in complex with other ligands 
will complement and increase mechanistic insights, but knowledge of 
this area may nevertheless remain incomplete until kinetic structural 
experiments come of age. 
METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

Protein expression. The wild-type mouse 5-HT3A receptor was expressed in a
stable, inducible cell line derived from HEK T-REx 293 cells (Thermo Fisher), as
previously described**°". The cells were cultured in suspension in flasks in an
orbital incubator (typical culture size, 5 l). The protein expression was induced
when cells reached 2 × 10⁶ cells/ml. Valproic acid was added one day later and
cells were cultured for one more day. Cells were then pelleted by low-speed centrifugation,
frozen and stored at −80 °C.

Protein purification. In a typical purification batch, 20 g of cells were resuspended
in buffer A (10 mM HEPES pH 7.4, 1 mM EDTA, antiprotease cocktail;
10 ml buffer per gram of cells), mechanically lysed (Ultraturrax T20, 6 × 30 s),
and membranes were collected by ultracentrifugation (100,000g for 1 h). All steps
were carried out at 4 °C. Membranes were resuspended in buffer B (50 mM Tris 
pH 8, 500 mM NaCl, antiprotease cocktail, 25 ml buffer per gram of membrane) 
and the solution was supplemented with 0.15% of C12E9 for solubilisation using 
gentle stirring (1.5 h). The insoluble material was removed by ultracentrifugation 
(100,000g for 45 min). Solubilized proteins were purified by affinity chromatog- 
raphy using gravity flow Strep-Tactin resin (IBA, typically 25 ml resin), eluted in 
buffer C (50 mM Tris pH 7.5, 125 mM NaCl, 0.01% C12E9) and concentrated 
to ~0.5 mg/ml using Millipore 100-kDa cut-off filters. The purification tag was 
cleaved, and carbohydrates were digested by addition of 0.04 mg TEV protease 
and 0.1 mg PNGase F per 1 mg protein with gentle stirring overnight. The protein 
was further concentrated and then applied to a Superose 6 column (GE Healthcare)
equilibrated in buffer C. 
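
The per-gram and per-milligram ratios quoted in this purification protocol can be scaled to a given batch with a few lines of arithmetic. The sketch below is only an illustration of that scaling: the function name is hypothetical, and the membrane mass and protein yield in the example are invented inputs, not values from this study.

```python
def purification_amounts(cell_mass_g, membrane_mass_g, protein_mg):
    """Scale the ratios quoted in the protocol to one batch.

    All three arguments are measured by the user; the ratios are the
    ones stated in the text.
    """
    return {
        "buffer_A_ml": 10.0 * cell_mass_g,      # 10 ml buffer A per gram of cells
        "buffer_B_ml": 25.0 * membrane_mass_g,  # 25 ml buffer B per gram of membrane
        "tev_mg": 0.04 * protein_mg,            # 0.04 mg TEV protease per mg protein
        "pngase_f_mg": 0.1 * protein_mg,        # 0.1 mg PNGase F per mg protein
    }


# Hypothetical batch: 20 g of cells (as in the text), 4 g of membranes
# and 2 mg of purified protein (both assumed for illustration).
batch = purification_amounts(cell_mass_g=20, membrane_mass_g=4, protein_mg=2)
for name, value in batch.items():
    print(f"{name}: {value:g}")
```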

Electron microscopy. The most homogeneous fractions of 5-HT3 receptor fol- 
lowing size-exclusion chromatography were concentrated to ~1.5 mg/ml (in the 
best cases, no concentration was required). The sample was mixed with lipids 
(0.01% phosphatidic acid, 0.01% cholesterol hemisuccinate, 0.01% brain phosphatidylcholine;
Avanti Polar Lipids) and ligands: 2 mM tropisetron (Tocris), or
50 µM serotonin and 2 mM calcium (conditions known to promote fast desensitization***);
or 30 µM serotonin and 100 mM TMPPAA (Sigma-Aldrich). Samples
were incubated for 10-30 min on ice. 3.5 µl were deposited on a glow-discharged
(30 mA, 50 s) Quantifoil copper-rhodium 1.2/1.3 grid, blotted for 10 s at force 0 
using a Mark IV Vitrobot and plunge-frozen in liquid ethane. Between four and 
ten grids were screened during each data collection, as ice thickness varied between 
grids. Optimization was performed on an in-house Polara electron microscope. 
Datasets were recorded on Titan Krios electron microscopes with K2 cameras at 
C-CINA (Basel) or at the ESRF (Grenoble). Details of data collections are shown 
in Extended Data Table 1. 

Image processing. At the Basel Krios microscope, the data collection was moni- 
tored online and good images were selected using Focus™; images were acquired 
in super-resolution mode and binned by Fourier-space cropping during the drift 
correction. At the ESRF Krios microscope, the counted mode was used. Drift was 
corrected with MotionCor2* and dose-weighted sums were used for subsequent 
processing, except for CTF correction, which was performed using GCTF* on 
non-dose-weighted sums. Picking was performed with Gautomatch
(http://www.mrc-lmb.cam.ac.uk/kzhang/Gautomatch/) using averages from 2D classes as
templates. Subsequent steps were performed using Relion®” on a GPU workstation. 
Typically, two rounds of 2D classifications with 20-30 classes were performed, 
followed by a round of 3D classification without imposing symmetry (3-6 classes) 
with a low-pass-filtered initial model of the receptor. Particles presenting five-fold 
symmetry were selected, submitted to 3D classifications (classifications and data 
processing are further described in Extended Data Figs. 2, 4) and the best sets of 
particles were subjected to 3D auto-refinement. In the post-processing step, a soft
mask was calculated and applied to the two half maps before the Fourier shell
correlation (FSC) was calculated. Map sharpening (B-factor fixed at −100 Å² for I1 and F,
automatic estimation for T and I2) was also performed in the post-processing step.
We tried the Phenix auto-sharpen program’, which improved only the F map.
The quality of the final reconstructions is shown in Extended Data Fig. 5. 

Model refinement and structure analysis. Refinement was performed with the 
Phenix suite*®. Cycles of real-space refinement were performed using global min- 
imization, rigid body fit and local rotamer fitting (and B-factor refinement in late 
stages), alternating with manual rebuilding in Coot’. NCS and secondary structure 
restraints were enabled. The 4 models comprise residues 10-307 and 426-460. T, I1
and I2 also have MX and MA residues 308-330 and 399-426, which, owing to flexibility,
could not be built in F. Tropisetron (numbered 902) was placed in the orthosteric
site of T, and serotonin (numbered 901) was placed in the site of I1, F and I2.
The densities for serotonin in I1 and F enable the ligand to be placed in several equivalent
orientations, and we used the unambiguous density in I2 to choose the same
ligand pose in these 3 structures. The stereochemical properties of the final mod- 
els, analysed with the Molprobity server (http://molprobity.biochem.duke.edu/), 


are reported in Extended Data Table 1. Pore profiles were plotted using HOLE",
r.m.s.d. values were calculated with ‘superpose’ in the CCP4 suite”. Figures were
prepared with the PyMOL Molecular Graphics System (Schrödinger), Chimera*?
or CueMol. 

Molecular dynamics. Molecular assays were built for the four conformational 
states of 5-HT3 described in the manuscript. For F, I1 and I2, we used the reported
cryo-EM structures, whereas for the inhibited conformation we used the crystal
structure (RCSB Protein Data Bank code (PDB): 4PIR) because of its higher resolution.
The co-crystallized nanobodies were removed and the protein was modelled
without serotonin. The five serotonins present in F, I1 and I2 structures were kept
in the models. For I2, because no obvious densities were observed for TMPPAA, it
was not represented in the model. The intracellular domain of the F conformation 
(not resolved in the cryo-EM density) was not included in the model. 

Using the CHARMM-GUI web interface“, each structure was embedded in
a fully hydrated palmitoyl-oleyl-phosphatidylcholine (POPC) bilayer consisting of
around 240 lipid units, with about 30,000 water molecules for F and 42,000 for T, I1
and I2. K+Cl− (150 mM) was explicitly added to each system, while ensuring
electric neutrality. The all-atom CHARMM36 force field“ and revision thereof for
lipids*” were used to describe the system and CMAP corrections were introduced for
the protein*’. For water, we used the TIP3P model”. A fraction of the mass of the heavy
atoms of lipids, protein and serotonin was transferred to the hydrogen atoms to
which they are bound, to reach a hydrogen mass of 3.024 Da. Using such a mass
repartitioning scheme, the equations of motion can be integrated with a time step
of up to 4 fs without modifying the dynamics and thermodynamics of the system°?.
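
The mass repartitioning idea can be illustrated with a toy example. The sketch below is not the procedure used to prepare the simulations; it only shows, under simple assumptions, how mass can be moved from a heavy atom to its bonded hydrogens so that each hydrogen reaches 3.024 Da while the total mass is conserved. The function name, the bond-list representation and the mass threshold used to recognize hydrogens are all hypothetical choices.

```python
def repartition_hydrogen_mass(masses, bonds, target_h_mass=3.024, h_threshold=1.1):
    """Return a new mass list in which every bonded hydrogen weighs target_h_mass.

    masses: atomic masses in Da, indexed by atom.
    bonds: (i, j) index pairs of covalently bonded atoms.
    An atom lighter than h_threshold is treated as hydrogen (an assumption
    of this sketch). The mass added to a hydrogen is removed from the heavy
    atom it is bonded to, so the total mass is unchanged.
    """
    new_masses = list(masses)
    for i, j in bonds:
        for hydrogen, heavy in ((i, j), (j, i)):
            if masses[hydrogen] < h_threshold <= masses[heavy]:
                delta = target_h_mass - masses[hydrogen]
                new_masses[hydrogen] += delta
                new_masses[heavy] -= delta
    return new_masses


# Toy methyl group: one carbon bonded to three hydrogens.
methyl_masses = [12.011, 1.008, 1.008, 1.008]
methyl_bonds = [(0, 1), (0, 2), (0, 3)]
repartitioned = repartition_hydrogen_mass(methyl_masses, methyl_bonds)
print(repartitioned)
```

In this toy case each hydrogen gains 2.016 Da and the carbon loses 6.048 Da. Water is left out of the example because the text restricts the repartitioning to lipids, protein and serotonin.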

All simulations were carried out with the NAMD package v.2.12°1. Simulations 
were performed in the isothermal-isobaric ensemble at T = 300K and P = 1 atm 
with anisotropic scaling of the simulation cell*’, long-range electrostatic interac- 
tions were treated with the particle-mesh Ewald method’, and short-range elec- 
trostatics and Lennard-Jones interactions were smoothly truncated. The equations 
of motion were integrated with a time step of 4 and 8 fs for short- and long-range 
forces, respectively, using the Verlet r-RESPA multiple time-step propagator”. 
Covalent bonds involving hydrogen atoms were constrained to their equilibrium 
length by means of the ‘rattle/shake algorithm’*>*° and the ‘settle’ algorithm was 
used for water”. For each system, a smooth equilibration along which the positions 
of the heavy atoms of the protein were restrained harmonically, was carried out 
for 60 ns. After releasing the restraints, the simulations were extended up to 1 µs.
All analyses and molecular rendering were achieved with VMD™. Pore radii were 
inferred using the program HOLE”. 

The potentials of mean force underlying the translocation of a potassium ion
in the I1 and F conformations of the 5-HT3 receptor were determined using a
multiple-walker version®? of the adaptive biasing force algorithm. For the I1 
conformation, the reaction coordinate model was chosen as the Euclidian dis- 
tance between the ion and the centre of mass of the protein, projected onto its 
longitudinal axis; that is, the z-direction of Cartesian space. In the case of the F 
conformation, a two-dimensional free-energy landscape was generated, exploring 
ion diffusion in the pore not only longitudinally by means of the aforementioned 
projected Euclidian distance, but also radially. The potential of mean force along 
the pore was recovered by integration of the marginal law in the radial direction. 
The reaction pathway was broken down into 14 and 12 windows for the I1 and
F conformations, respectively. The free-energy landscapes were explored by four
walkers, syncing gradients every 500 molecular-dynamics steps. The total simulation
time for the I1 and F conformations amounted to 2.16 and 3.04 µs, respectively,
wherein the last 0.8 µs was used to estimate the error bars associated with
the potentials of mean force, based on an independent mapping of the free-energy 
landscape by the walkers. 
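
The reduction of a two-dimensional free-energy landscape to a one-dimensional potential of mean force along the pore axis can be sketched as a Boltzmann-weighted integration over the radial coordinate. The code below is a minimal illustration under stated assumptions, not the analysis used in this study: the function name and grid layout are hypothetical, the integral is a simple rectangle rule, and no Jacobian factor is applied because the text does not specify how the radial coordinate was defined.

```python
import math

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def marginal_pmf(free_energy_2d, dr, temperature=300.0):
    """Integrate out the radial coordinate of a 2D free-energy grid.

    free_energy_2d[i][j] is the free energy (kcal/mol) at axial bin i and
    radial bin j; dr is the radial bin width. Returns the 1D profile along
    the axis, shifted so that its minimum is zero.
    """
    kt = KB_KCAL * temperature
    profile = []
    for radial_slice in free_energy_2d:
        # Marginal probability (up to a constant) of this axial position.
        weight = sum(math.exp(-f / kt) for f in radial_slice) * dr
        profile.append(-kt * math.log(weight))
    lowest = min(profile)
    return [value - lowest for value in profile]


# Toy landscape: the second axial bin is uniformly 1 kcal/mol higher.
toy_grid = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
toy_profile = marginal_pmf(toy_grid, dr=0.5)
print(toy_profile)
```

A uniform offset in one axial slice carries over unchanged to the one-dimensional profile, which is a quick sanity check on the integration.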
Electrophysiology. Electrical recordings were obtained by two-electrode voltage-clamp
(TEVC) on Xenopus oocytes expressing either wild-type or mutated
homomeric mouse 5-HT3A receptors. Mutants were obtained using the
QuikChange Lightning (Stratagene) site-directed mutagenesis kit and oligonucleotides
(Supplementary Table 2) to introduce point mutations into the pcDNA5/TO-m5-HT3A
plasmid. All the mutations were verified by sequencing.

Oocytes were prepared as previously reported®! using procedures that conformed
to European regulations for animal handling and experiments, and were
approved by governmental services (authorization no. D 38 185 10 001 for the animal
facility delivered by the Prefect of Isère) and the Institutional Ethics Committee
(ethics approval N° 12-040 granted to C. Moreau by the Ethics Committee of
Commissariat à l'Énergie Atomique et aux Énergies Alternatives). Defolliculated
oocytes were injected with 30 nl plasmid DNA (1-10 ng/µl) coding for the desired
5-HT3A subunit (subcloned into pcDNA5 vector). Microinjected oocytes were
incubated for 1 to 5 days at 19 °C in Barth’s solution (in mM: 1 KCl, 0.82 MgSO4,
88 NaCl, 2.4 NaHCO3, 0.41 CaCl2, 16 HEPES, pH 7.4) supplemented with
100 U ml−1 penicillin and streptomycin.

Whole-cell TEVC recordings were obtained using an OC-725C Oocyte Clamp
amplifier (Warner Instruments) at a constant holding potential of −50 mV.
Macroscopic currents were filtered at 1 kHz, digitized at 2 kHz with a Digidata 1440
analogue-to-digital interface and analysed with Clampfit (Molecular Devices).
During the recordings, oocytes were constantly perfused with a 0 Ca-ND96
solution (containing in mM: 91 NaCl, 2 KCl, 1 MgCl2, 5 HEPES, 1 EGTA,
pH 7.4) at a rate of 1.3 ml/min, which allowed the application of serotonin and
other compounds in a time range of seconds (20-fold concentration change of
applied compounds in 5 s). Throughout this study, electrophysiological responses
were induced by 10 µM serotonin (a saturating dose for all tested constructs,
not shown) and peaked in less than 3 s. In all constructs tested, 10-min applications
of serotonin induced almost complete desensitization (to less than 3% of
the peak current with t1/2 in the min range). Full recovery from desensitization was
observed after washing serotonin for 10 min. Ca2+ ions are permeant through
5-HT3 receptors™. Calcium was therefore carefully removed from the recording
perfusion solution to avoid contamination of the serotonin-induced responses by
endogenous oocyte current secondarily activated by Ca2+ ions entering the oocyte
through the serotonin receptors.

MTSEA (2-aminoethyl methanethiosulfonate bromide) and MTSET (2-trimethylammonium-ethyl
methanethiosulfonate bromide) were purchased from
Interchim and prepared immediately before perfusion from stock solutions in
water stored at −20 °C. The effect of MTS compounds on pore cysteine mutants
was studied using the following protocol: (i) checking stability of the response
with a train of three to five applications of 10 µM serotonin for 10 s, every 5 min
(Ipre); (ii) applying MTS compounds for 2 min at 1 mM, either in the absence of
5-HT (to probe cysteine accessibility in the resting state), simultaneously with
10 µM 5-HT (to probe the open state) or simultaneously with 10 µM 5-HT after
a 10-min pre-application of 5-HT alone allowing for complete desensitization;
(iii) following the effect of MTS compounds during a second set of five applications
of 5-HT for 10 s every 5 min (Ipost). Irreversible effects of MTS compounds were
quantified by measuring Ipost 25 min after removal of the MTS compound. The
percentage of inhibition or potentiation was calculated as (Ipre − Ipost)/Ipre × 100.
MTS compounds (applied at 1 mM simultaneously with 10 µM serotonin) have
no detectable effect on wild-type receptors (not shown).
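
The quantification of irreversible MTS effects reduces to a single ratio of peak currents measured before and after modification. A minimal sketch of that calculation follows; the function name and the example currents are hypothetical and serve only to show the sign convention (positive values for inhibition, negative values for potentiation).

```python
def percent_inhibition(i_pre, i_post):
    """Return (Ipre - Ipost) / Ipre x 100.

    i_pre and i_post are peak serotonin-evoked currents in the same units,
    recorded before and after MTS modification. A positive result is an
    inhibition and a negative result is a potentiation.
    """
    if i_pre == 0:
        raise ValueError("Ipre must be non-zero")
    return (i_pre - i_post) / i_pre * 100.0


# Hypothetical inward currents in microamperes.
inhibited = percent_inhibition(-2.0, -0.36)    # about 82% inhibition
potentiated = percent_inhibition(-2.0, -2.5)   # negative value: potentiation
print(round(inhibited, 1), round(potentiated, 1))
```

Because the ratio is taken between currents of the same sign, the result does not depend on whether inward currents are stored as negative or positive numbers.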

Voltage-clamp fluorometry. VCF recordings were performed on Xenopus oocytes
provided by Ecocyte Bioscience that were injected with 50 ng/µl plasmid DNA
encoding the mouse 5-HT3A receptor S19’C mutant, after 2 to 6 days of expression
in Barth solution at 17 °C. They were labelled with MTS-TAMRA (Toronto
Chemicals) for 5 min at 17 °C in ND96 buffer without CaCl2 containing 10 µM
5-hydroxytryptamine (5-HT), then rinsed and stored at 17 °C for up to 4 h before
recording in ND96. Recordings were made in a TEVC setup as described", adapted
for fluorescence recording. In brief, recordings were made with a VCF-dedicated
chamber with only a fraction of the oocyte perfused and illuminated by an LED
(coolLED PE-4000). Light was collected using a fluorescence microscope (Zeiss
Axiovert135) equipped with a 40× objective (Plan Neofluar), a TRITC filter set
and a photomultiplier tube (Hamamatsu Photonics). Recordings were made at
−60 mV clamp, 500-Hz sampling rate with a 550-nm excitation wavelength. Data
were filtered, corrected for baseline and photobleaching where necessary, and analysed
using pClamp and Axograph. Dose-response curves were fitted using Prism
to the Hill equation: I/Imax = 1/(1 + (EC50/[5-HT])^nH), in which I is the response
at a given [5-HT] (serotonin concentration), Imax is the maximal response, EC50
is the half-maximal effective concentration and nH is the Hill coefficient. Serotonin
dose-response relations were shifted to the left with half-responses for fluorescence
and current at 40 and 240 nM, respectively (the EC50 of the wild-type receptor is
800 nM).
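
The Hill equation used for the dose-response fits can be written as a short function to show how the normalized response depends on the half-maximal concentration and the Hill coefficient. The sketch below only evaluates the equation; it does not reproduce the fitting done in Prism. The function name is hypothetical, and the Hill coefficient of 1.5 is an arbitrary illustrative value because none is reported in this section. The two half-maximal concentrations are the ones quoted above for the mutant.

```python
def hill_response(concentration, ec50, n_h):
    """Normalized response I/Imax = 1 / (1 + (EC50 / [5-HT])**nH).

    concentration and ec50 must be in the same units.
    """
    if concentration <= 0:
        return 0.0
    return 1.0 / (1.0 + (ec50 / concentration) ** n_h)


# Half-maximal concentrations (nM) quoted in the text for the labelled mutant.
EC50_FLUORESCENCE_NM = 40.0
EC50_CURRENT_NM = 240.0
N_H = 1.5  # assumed Hill coefficient, for illustration only

for dose_nm in (10.0, 40.0, 240.0, 1000.0):
    fluo = hill_response(dose_nm, EC50_FLUORESCENCE_NM, N_H)
    curr = hill_response(dose_nm, EC50_CURRENT_NM, N_H)
    print(f"{dose_nm:7.1f} nM  fluorescence {fluo:.2f}  current {curr:.2f}")
```

At a concentration equal to the half-maximal value the function returns 0.5 whatever the Hill coefficient, which is the defining property of that parameter.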

Fluorescence imaging plate reader membrane potential blue assay. The agonist
properties of 5-HT and TMPPAA (Sigma-Aldrich) were characterized at human
wild-type or mutant 5-HT3A receptors transiently expressed in tsA201 cells in
the fluorescence imaging plate reader (FLIPR) membrane potential blue (FMP;
fluorescence-based membrane potential) assay. The generation of some of the
human 5-HT3A mutants has previously been described®’. Other mutants were
constructed by introduction of point mutations into the h5-HT3A-pCIneo plasmid
using QuikChange II XL site-directed mutagenesis (Stratagene) and oligonucleotides
(TAG Copenhagen). The absence of unwanted mutations in all cDNAs
created by PCR was verified by sequencing (Eurofins MWG Operon). The cells
were cultured in Dulbecco's Modified Eagle Medium supplemented with penicillin
(100 U/ml), streptomycin (100 mg/ml) and 10% fetal bovine serum in a humidified
atmosphere of 5% CO2 and 95% air at 37 °C. Cells (1.2 × 10⁶) were split into a 6-cm
tissue culture plate, transfected the following day with 4 µg cDNA (wild-type or
mutant h5-HT3A-pCIneo) using Polyfect (Qiagen) as transfection reagent, and
split into poly-D-lysine-coated black 96-well plates (8 × 10⁴ cells per well) with
clear bottom (BD Biosciences) the following day. After 20-24 h following transfection,
the medium was aspirated, and the cells were washed with 100 µl Krebs
buffer (140 mM NaCl, 4.7 mM KCl, 2.5 mM CaCl2, 1.2 mM MgCl2, 11 mM HEPES,
10 mM D-glucose, pH 7.4). Then 100 µl Krebs buffer supplemented with 0.5 mg/ml
FMP assay dye (Molecular Devices) was added to each well, and the 96-well
plate was assayed at 37 °C in a FLEXStation3 Benchtop Multi-Mode Microplate
Reader (Molecular Devices) measuring emission (in fluorescence units (FU))
at 565 nm caused by excitation at 530 nm before and up to 90 s after addition of
33.3 µl assay buffer supplemented with agonist (5-HT or TMPPAA). The experiments
were performed in duplicate at least three times for each agonist at all
receptors. Concentration-response curves for the agonists were constructed based
on the difference in the fluorescence units (ΔFU) between the maximal fluorescence
recording made before and after addition of the agonists at different concentrations.
The curves were generated by non-weighted least-squares fits using
the program KaleidaGraph (Synergy Software).
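
The response measure described for this assay, the difference between the maximal fluorescence recorded before and after agonist addition, can be sketched for a single well as follows. This is an illustration only: the function name, the list-based trace and the index marking agonist addition are hypothetical, and the plate-reader software may compute the value differently.

```python
def fluorescence_difference(trace, addition_index):
    """Difference between maximal fluorescence after and before agonist addition.

    trace: fluorescence readings (FU) for one well, in time order.
    addition_index: index of the first reading taken after agonist addition.
    """
    before = trace[:addition_index]
    after = trace[addition_index:]
    if not before or not after:
        raise ValueError("need readings on both sides of the addition")
    return max(after) - max(before)


# Hypothetical well: three baseline readings, then three after agonist.
well_trace = [100, 102, 101, 150, 180, 170]
well_response = fluorescence_difference(well_trace, addition_index=3)
print(well_response)
```

One such value per agonist concentration gives the points to which a concentration-response curve is then fitted.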

Reporting summary. Further information on research design is available in 
the Nature Research Reporting Summary linked to this paper. 
Atomic coordinates of the four conformations have been deposited in the Protein
Data Bank with accession numbers 6HIN, 6HIO, 6HIQ and 6HIS for conformations
F, I1, I2 and T, respectively. The cryo-EM density maps have been deposited
in the Electron Microscopy Data Bank with accession numbers EMD-0225, EMD-0226,
EMD-0227 and EMD-0228 for conformations F, I1, I2 and T, respectively.
Extended Data Fig. 1 | 5-HT3 receptor topology and sequence.
a, Schematic of the full-length 5-HT3 receptor. The colour code is the
same for all panels. Scissors indicate the enzymatic treatments used
during purification: purification tag removal using TEV protease, and
partial removal of carbohydrates using PNGase F. b, Topology scheme
and structure of one subunit of the 5-HT3 receptor. c, Sequence and
numbering of the mouse 5-HT3A receptor used in the present study.
Secondary structures and important loops are indicated. Neurotransmitter
site binding loops are highlighted. Important residues discussed in
the study are boxed (pore-facing in red, sandwich motif in brown and
neurotransmitter site in blue). Cyan highlights residues important for
TMPPAA potency. Glycosylation sites are depicted in green. The mouse
receptor used for this study is a variant compared to the consensus
sequence as it contains an alanine insertion in the M2-M3 loop, which
is highlighted by the pink box. Notably, this insertion is present in the
consensus human receptor sequence. This panel is adapted from ref. °.
d, A typical unmasked reconstruction, with different density levels
overlaid depicting the protein itself (yellow), ligands (purple), linked
glycans (green), the detergent-lipid belt (transparent purple) and the
footprint of the disordered zone (transparent grey). e, Currents induced
in Xenopus oocytes expressing the mouse 5-HT3A receptor by a 3-min
application of 10 µM serotonin (green traces), co-application of 10 µM
serotonin and 100 nM tropisetron (following a 5-min pre-application of
tropisetron, red trace), co-application of 10 µM serotonin and 10 µM of
TMPPAA (following a 5-min pre-application of TMPPAA, black trace).
Current traces are representative of 3 independently repeated experiments.
Extended Data Fig. 2 | Electron microscopy and 3D reconstructions for
the tropisetron dataset (T), and for the serotonin + TMPPAA dataset (I2).
a, Tropisetron dataset (T). Selection of 2D class averages from the set of
particles used for refinement of the tropisetron reconstruction.
b, Three-dimensional reconstruction from the tropisetron dataset, filtered
and coloured according to RELION local resolution. A global and a cut-through
side view are shown. c, FSC curves for the density map before
and after RELION post-processing, and between the model and the final
map. d, Serotonin + TMPPAA dataset (I2). Selection of good 2D classes
after one round of classification yields a set of 126,000 particles. One
round of 3D classification with no symmetry imposed yields one class
with pentameric symmetry, amounting to 62,000 particles. e, Unmasked
unsharpened refined 3D reconstruction, filtered and coloured according
to RELION local resolution. The colour range for resolution is similar to
the equivalent representations for the other datasets in Extended Data
Fig. 4c, allowing direct comparison. f, FSC curves for the density map
before and after RELION post-processing, and between the model and
the final map. g, Left, a non-sharpened map filtered at 3.9 Å resolution,
obtained with a subset of 55,000 particles from the I2 dataset, selected
after further 3D classification focused on the ICD. The hypothetical
trajectory of the polypeptide chain after MX is bordered by a green line.
No full connectivity can be visualized and the model was not built. The
chain appears to descend abruptly after MX, and may interact with MA
of the neighbouring subunit at the level of H411, where there is a clear
density. Then it must link to the beginning of MA, but maps show no
information on how it may do so. The schematic representation highlights
that this putative trajectory contacts MA on the neighbouring subunit
close to residue H411. Right, the corresponding sharpened and masked
reconstruction, showing that model building is not possible. The density of
the disordered stretch close to H411 is highlighted by the pink ellipses.
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Extended Data Fig. 3 | Serotonin and tropisetron bound in the 
orthosteric site; the vestibule allosteric site. a, Organization of one of 
the five equivalent binding sites at the subunit interfaces in the ECD. 
Binding loops A-C are located on the principal subunit, and binding 
elements D-G are on the complementary subunit. The density around 
serotonin in the I2 reconstruction is shown as a transparent blue surface. 
b, c, Serotonin (ball-and-stick) in the orthosteric site in the I2 structure. 
Aromatic residues (dotted surfaces in c), hydrogen bonds with main chain 
atoms (dashed lines) and charged residues within salt-bridge distances 
(D202 on the principal side, R65 on the complementary side) are noted 

in two orthogonal views. Of note, mutants D202A“ and R65A® exhibit 
impaired serotonin binding with increases of around 140-fold and 50-fold, 
respectively, in Kj (the equilibrium inhibitor constant) during competition 
assay for [*H]-granisetron binding. d, Superimposition of tropisetron- 
bound T (blue) and serotonin-bound I2 (green) structures, highlighting 
loop C motion and quaternary reorganization with arrows (principal 
subunit superposition in this panel and in g and h; note complementary 
subunit B-strand shift). Serotonin (light green) and tropisetron (magenta) 
are represented as sticks. e, f, Tropisetron bound to the orthosteric site 

in the T structure. The bicyclic tropane moiety is sandwiched between 
W156 of loop B and Y207 of loop C, whereas the indole lies close to R65, 
W63 in loop D, D42 and 144 in loop G, and R169 in loop F. Functional 
exploration of the binding mode of tropisetron consistently showed that, 
among others, single mutations to cysteine of the aromatic residues W156, 
Y207 or W63 abolished binding. To fit tropisetron, the side chain of R65
is pushed upward and in turn displaces the side chains of Y67 and W168,
compared to the empty orthosteric site of the X-ray structure. Densities
for these side chains are not well-resolved. The density for tropisetron is
shown in Extended Data Fig. 5f. g, Superposition of the I2 structure with
the crystal structure of a mutant acetylcholine binding protein (AChBP, 
wheat cartoon) in complex with serotonin (yellow sticks) shows distinct
binding modes. h, Superposition of the T structure with the crystal
structure of AChBP (wheat cartoon) in complex with tropisetron (yellow
sticks) shows a similar orientation of ligands but different interactions
with the protein at loops C, D, E, F and G (not shown for clarity), consistent
with the 3 orders of magnitude difference (0.7 versus 479 nM) in Kd
(dissociation constant). i, Overlay of the T and I2 structures (in blue
and green, respectively) superimposed on a subunit ECD, showing the
motion of F103 out of the vestibule site. The protein, depicted as ribbon,
is viewed from inside the vestibule, and the intra-subunit cavity (in
the I2 conformation) is represented as a grey surface. Acetate in GLIC,
flurazepam in the bacterial homologue ELIC or a drug fragment in
α7-AChBP bind to this cavity. A putative rationale for the strong effect
of F103 mutants on serotonin EC50, despite its absence of participation in
the binding site, comes from the observation of its concerted motion with 
that of strands from the neighbouring subunit (around P128 and around 
P110; for clarity, the latter is not represented here). Therefore, the motion 
of F103 may participate in subunit-subunit quaternary reorganization, 
exemplifying the allosteric coupling between sites. 
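
As a quick check of the affinity comparison in panel h, the two dissociation constants quoted in the legend (0.7 and 479 nM) can be compared directly. The short Python sketch below is illustrative only: the variable names are ours, and no assumption is made about which value belongs to which protein.

```python
import math

# Dissociation constants quoted in the legend for panel h, in nM.
# Names are illustrative; the assignment of each value to a protein
# is deliberately not assumed here.
kd_low_nm = 0.7
kd_high_nm = 479.0

# Ratio of the two constants and its base-10 logarithm.
fold_difference = kd_high_nm / kd_low_nm
orders_of_magnitude = math.log10(fold_difference)

print(f"fold difference: {fold_difference:.0f}")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio is a little under 700-fold, which is why the legend rounds it to roughly 3 orders of magnitude.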


© 2018 Springer Nature Limited. All rights reserved. 


[Extended Data Fig. 4 panels. Processing workflow: 2D classification (select receptor-looking classes); 3D classification (select symmetric classes); consensus reconstruction after refinement; ‘large’ mask focused on the lower TMD. FSC plots: unmasked, masked and model/map curves against resolution (1/Å). Slab views: lower TMD slab, large differences; top of TMD slab, small differences; ECD slab, no visible difference.]

Extended Data Fig. 4 | See next page for caption. 
Extended Data Fig. 4 | Electron microscopy, classifications and 3D
reconstructions for the serotonin dataset. a, Schematic of the data 
processing, with classical rounds of 2D and 3D classification yielding a 
consensus reconstruction seen parallel to the membrane plane (grey) 
where helices look like tubes of elliptic sections, indicating heterogeneity. 
Extensive classification trials, without alignment, with masks of various
sizes covering the lower part of the TMD (shown in green and red) enable
two conformations to be distinguished (yellow corresponding to the I1
structure, and purple corresponding to the F structure), representing
the extreme positions of the helices in the consensus reconstruction. An
overlay of the two reconstructions depicts the good superimposition
of the ECD and the clear difference in the lower TMD. Grey rectangles
indicate the positions of the slabs represented in b. All reconstructions 
shown are unmasked outputs of RELION 3D refinements. b, Slabs, viewed 
perpendicular to the membrane plane, from the intracellular side showing 
the overlay of the I1 and the F reconstructions. c, Reconstructions filtered 
and coloured according to RELION local resolution. A global and a cut- 
through side view are shown. d, Selection of 2D class averages from the 
set of particles used for refinement of the I1 and F reconstructions. e, FSC 
curves showing the unmasked and masked FSCs (before and after post- 
processing in RELION), and the model-to-map FSC. f, Selection of ‘slice’ 
views of the TMD of the final unmasked unsharpened reconstructions. 
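
For readers unfamiliar with how a resolution figure is read off an FSC curve such as those in panel e, the following Python sketch shows one common way of doing it: find where the curve first falls below a threshold (0.143 is the value listed in Extended Data Table 1) and convert that spatial frequency to ångströms. The function name and the toy numbers are hypothetical and are not taken from the curves in this figure; linear interpolation between sampled points is an assumption of the sketch.

```python
def resolution_at_threshold(frequencies, fsc_values, threshold=0.143):
    """Return the resolution (in Å) at which the FSC first drops below
    `threshold`, using linear interpolation between sampled points.

    `frequencies` are spatial frequencies in 1/Å, in increasing order.
    Returns None if the curve never crosses the threshold.
    """
    pairs = list(zip(frequencies, fsc_values))
    for (f0, c0), (f1, c1) in zip(pairs, pairs[1:]):
        if c0 >= threshold > c1:
            # Linear interpolation for the crossing frequency.
            f_cross = f0 + (c0 - threshold) * (f1 - f0) / (c0 - c1)
            return 1.0 / f_cross
    return None


# Made-up FSC samples, for illustration only.
toy_frequencies = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
toy_fsc = [1.00, 0.98, 0.90, 0.60, 0.243, 0.043]

toy_resolution = resolution_at_threshold(toy_frequencies, toy_fsc)
print(f"toy resolution: {toy_resolution:.2f} Å")
```

Processing packages apply their own conventions for masking and interpolation, so a value obtained this way should be treated as an approximation of what the software reports.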
Extended Data Fig. 5 | Quality of the density maps in representative
regions of the four 5-HT3 receptor reconstructions. Densities in mesh
representations overlaid with structures for the T (blue), I1 (yellow),
F (purple) and I2 (green) reconstructions and structures, left to right,
respectively, in each panel. Views a–e are approximately the same as those
in extended data figure 4 of a previous publication of α1 GlyR.
a, Representative densities of the β-sheets in the ECD. b, Densities of
the Cys loop and the M2–M3 loop. c, Densities of helices M1 and M2.
d, Densities of M3 and M4. e, Densities of M2 at the level of L9′ (L260).
f, Densities around the ligands tropisetron (in T) and serotonin (in I2).
The 3.2 Å resolution of I2 permits unambiguous orientation of serotonin,
and side chains of residues around the ligands are seen in densities. For
tropisetron, given the limited 4.5 Å global resolution, the orientation of
the ligand and the positions of surrounding side chains are less certain.
The views are the same as in Extended Data Fig. 3b, e.
Extended Data Fig. 6 | Structural superpositions with the nAChR α4β2,
the GABAA receptor β3 subunit and the apo 5-HT3 receptor structures.
a, Putty representations of pairwise deviations for the 5-HT3 receptor
conformations. The selection used for superimposition, and the two
conformations used, are noted for each image. The colour code and tube
thickness code are the same for all images. In the T versus X-ray image,
the red zone corresponds to a loop that was not modelled in the X-ray
structure. b, Superimposition of a 5-HT3 receptor subunit in the inhibited
T state (blue) and of a 5-HT3 receptor subunit in the apo state (grey, PDB:
6BE1). Structures are globally similar with differences in the lipid-exposed
helices M3, MX and M4, hypothetically a consequence of the different
additives used: a lipid mixture in this study versus fluorinated fos-choline
8 for the apo structure. Superposition of the T structure (blue), the apo
structure (grey) and the X-ray structure (orange). c, Superposition of a
5-HT3 receptor subunit in the I1, I2 and F conformations with a nAChR
α4 subunit (PDB: 5KXI, chain A) or a GABAA receptor β3 subunit (PDB:
4COF). Cα r.m.s.d. is noted. Comparison of superpositions with r.m.s.d.
below 2 Å shows that in the case of F and nAChR α4 (middle, purple
and grey), the domain-to-domain orientation is very similar and the
extracellular halves of helices M1, M2, M3 superimpose very well, whereas
clear deviations are present in their intracellular halves and at the level
of M4 (r.m.s.d. without M4 drops to 1.6 Å). In the case of I1 and GABAA
β3 (left, yellow and black), or I2 and GABAA β3 (right, green and black),
differences are more distributed; deviations of the ECD indicate
different domain-to-domain orientations, and in the TMD M1 and M3
superimpose quite well but M2 and M4 clearly differ.
[Panel labels: superimposition on the ECD; 4COF GABAAR β3, 5KXI nAChR α4, 4COF GABAAR β3; RMSD = 2.2 Å, RMSD = 2.6 Å, RMSD = 1.8 Å.]
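
The Cα r.m.s.d. values noted in panel c are root-mean-square distances between paired Cα positions after superposition. As a minimal Python sketch of that final step only, assuming the two structures have already been superimposed and their residues paired (the superposition itself is not performed here), the calculation looks like this. The function name and the toy coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two equal-length lists of
    (x, y, z) positions. The inputs are assumed to be already
    superimposed and paired residue by residue."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates: each atom in toy_b is displaced by exactly 1 Å.
toy_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (3.8, 0.0, -1.0), (7.6, 1.0, 0.0)]

toy_rmsd = ca_rmsd(toy_a, toy_b)
print(f"toy r.m.s.d.: {toy_rmsd:.2f} Å")
```

Restricting the paired atoms to a subset, as the legend does when it reports the value without M4, only changes which coordinates are passed in.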
Extended Data Fig. 7 | TMPPAA binds to an allosteric site in the TMD.
a, Cavities in surface mode for the T (blue), I1 (yellow), F (purple) and I2
(green) states. Note the re-arrangement of the cavities between states. The
protein is depicted in cartoon, with sticks in the right panels for residues in
which mutations impair the effect of TMPPAA. Noisy densities are found
in the I2 reconstruction, in or close to the cavities, but they do not permit
ligand modelling and could correspond to parts of a lipid or detergent
molecule. b, Side view of the transmembrane domain of I2 with mutated
positions depicted as green spheres (for residues for which mutations had
negligible effect on TMPPAA potency), orange spheres (for residues for
which mutations reduced TMPPAA potency substantially) or red spheres
(for mutations that completely or almost completely eliminated TMPPAA
activity at concentrations up to 100 µM). Yellow circles indicate residues
for which different mutations produced different effects. c, Functional
properties exhibited by serotonin and TMPPAA as agonists at 5-HT3A
receptors expressed in tsA201 cells in an FMP assay. The human 5-HT3A
receptor was used for these experiments, since TMPPAA evokes a more
robust agonist response through this receptor than through mouse
5-HT3A in this assay. The colour code is similar to that in b. n.d., not
determinable; w.a., weak activity. A complete table of data for all tested
mutant receptors is shown in Supplementary Table 1. d, Concentration–
response curves for serotonin (closed symbols) and TMPPAA (open
symbols) tested as agonists at wild-type, W472A and L243W human
5-HT3A receptors expressed in tsA201 cells, using the FMP assay. Data
are from a single representative experiment determined in the same 96-
well plate and are given as mean based on duplicate determinations. The
experiment was repeated independently with similar results 3 times for the
W472A and L243W receptors and 66 times for the wild-type receptor
(n for wild-type and all mutant receptors are given in c). e, Loss of
TMPPAA potency at the mouse W456A mutant receptor. Currents
evoked by 10 µM serotonin (blue curve) or 10 µM serotonin plus 100 µM
TMPPAA (red curve) are equivalent. The grey trace, obtained in an oocyte
expressing wild-type receptors (response to 10 µM serotonin, normalized
for the peak amplitude), shows that desensitization in wild-type and
W456A mutant receptors is similar. Current traces are representative of 3
independently repeated experiments.
Extended Data Fig. 8 | Structural motif and reorganization at the
ECD–TMD interface. a, b, Key interacting charged residues R218 (pre-M1,
key in gating), E53 (loop β1–β2), E186 (loop β8–β9, conserved
only in cationic eukaryotic receptors) and D145 (Cys loop), represented
as purple sticks, are sandwiched between conserved essential aromatic
residues (FPF motif of the Cys loop, W187 of β8–β9 and Y223 of M1),
represented as yellow sticks and dots. Lower in the membrane, the strictly
conserved P230, represented as spheres, enables M1 to kink. This
structural organization is common to all pLGICs of known structure.
Orientations of the views are indicated on the topology scheme. c, d, Side
and top views depicting the concerted relative positions of the Cys-loop
FPF ‘wedge’, represented as sticks, of transmembrane helices and of the
β8–β9 loop. Compared to the inhibited structure, the wedge moves
towards and pushes on M1 and also moves towards M4 in I1, I2 and F. F
presents the biggest re-organization in this zone: M1, the M2–M3 loop
and M3 follow the motion of the FPF motif, and M2 moves away from
the pore axis (that is, to the back in the view in c and to the top in d).
e, Conservation of the ‘sandwich’ structural motif in representative pLGIC
structures. The aromatic top and bottom layers are represented as spheres
(yellow, Cys loop; orange, β8–β9 loop and pre-M1 residue) whereas the
central layer of charges is depicted as sticks. Note that in anionic receptors
and GLIC, the charge of β8–β9 is absent (noted with a star). f, Sequence
alignment of the motif residues in representative pLGIC structures.
Extended Data Fig. 9 | See next page for caption.
Extended Data Fig. 9 | Geometry of the pore during the molecular
dynamics simulations. a, HOLE transmembrane pore profiles for
the trajectories starting from I1, F and I2. In each graph, the profile
from the cryo-EM structure is represented in black, the average profile
during the restrained part of the simulation (30-60 ns, mean ± s.d. of
n = 30 snapshots taken every ns) in the lighter colour and the average
profile during the unrestrained part (60-1,000 ns, mean ± s.d. of
n = 940 snapshots taken every ns) in the darker colour. The y-axis origin
corresponds to L260/9′ whereas S253/2′ is located at about −12 Å.
b, Complete potentials of mean force of K+ ions as a function of
position along the pore axis. The inset represents the 2D free-energy
landscape as a function of the position along the pore axis and the radial
direction, orthogonal to it. The represented free-energy profile is the
mean of 135,000,000 values distributed along 2.16 µs of the I1 trajectory
(190,000,000 values distributed along 3.04 µs of the F trajectory); the
standard deviation corresponds in both cases to 50,000,000 values
from four independent walkers, distributed along the last 0.8 µs of the
trajectory. c, Representative snapshots of the pore wetting during the
initial restrained part (30-60 ns) of the simulations. A dewetted pore is
observed for the T and I1 trajectories, a fully wetted one for the
F trajectory, whereas wetting and dewetting occur in the I2 trajectory.
Cα atoms of pore-exposed residues are represented as spheres.
d, Snapshot of the I2 trajectory showing the wetting of grooves at the back
of M2 (shown as molecular surface on the left, as cartoon on the right),
concomitant with wetting of the pore.
Extended Data Fig. 10 | Local motions in the pore probed by VCF.
Representative serotonin-evoked simultaneous current and fluorescence
recordings from oocytes expressing S19′C mutant receptors labelled
with MTS-TAMRA. Similar traces were obtained in 3 independent
experiments.
Extended Data Table 1 | Statistics of 3D reconstructions and model refinement

                                 Tropisetron   Serotonin             Serotonin and TMPPAA
Data collection and processing
  Microscope                     Krios@CCINA   Krios@CCINA           Krios@ESRF
  Magnification                  130,000       130,000               130,000
  Voltage (kV)                   300           300                   300
  Electron exposure (e−/Å²)      ~80           ~60                   ~50
  Defocus range (µm)             0.5-2.5       0.8-2.5               0.8-2.5
  Pixel size (Å)                 0.52          0.52                  1.067
  Symmetry imposed               C5            C5                    C5
  Number of images               ~1500         ~2000                 ~3000
  Particles picked               160k          456k                  250k
  Symmetric-looking particles    133k          145k                  62k
  Particles refined              43,558        16,660     10,667     62,032
Refinement
  Initial model used             4PIR          4PIR       4PIR       4PIR
  Resolution (Å)                 4.5           4.2        4.1        3.2
  FSC threshold                  0.143         0.143      0.143      0.143
  Map sharpening B-factor (Å²)   -247          -100       -100       -127
Model composition
  Non-hydrogen protein atoms     15,645        15,670     13,805     15,670
  Protein residues               1,935         1,925      1,675      1,925
  Ligands                        585           415        415        415
B-factor (Å²)
  Protein                        127           130        83         78
  Ligand                         113           102        73         53
R.m.s. deviations
  Bond lengths (Å)               0.008         0.009      0.009      0.006
  Bond angles (°)                1.5           1.1        1.2        1.0
Validation
  MolProbity score               2.0           1.45       1.35       1.18
  Clashscore                     1441          8.3        6.4        3.9
  Poor rotamers (%)              0.3           0.2        0.3        0.3
Ramachandran plot
  Favored (%)                    95.1          98.0       98.5       98.4
  Allowed (%)                    4.9           2.0        1.5        1.6
  Disallowed (%)                 0.0           0.0        0.0        0.0

Where two values appear under Serotonin, they refer to the two reconstructions derived from the serotonin dataset.
EFSA chief scientist Marta Hugas is seeking more than 200 scientists to conduct high-quality, regulatory-driven research.


Perks of agency science 


European Commission scientists enjoy work-life balance and dynamic career paths. 


BY QUIRIN SCHIERMEIER 


David Krasa had never worked in one
place for much more than three
years before he became a research-
programme officer at the European Research
Council (ERC) in 2009. After the German sci- 
entist earned his PhD in geophysics from the 
Ludwig Maximilian University of Munich in 
Germany, he gained research experience as a 
postdoc at the University of Hawaii at Manoa 
and the University of Edinburgh, UK. For 
Krasa, seeing the world was part of the joy of 
studying rocks and minerals. 

But when he and his wife had their first child 
in 2006, he wanted something secure. “I love
science,” he says. “But job security and work-
life balance become more and more important
when you start a family.”

Krasa had thought that research manage- 
ment, which involves organizing calls for 
proposals and coordinating administrative 
support for funded research projects, was 
a good option for permanent employment 
that would keep him in touch with science. In 
December 2008, he accepted a position as a 
research-programme officer for Earth sciences 
and solid-state physics at the ERC, which was 
set up two years earlier as the European Union's 
premier funding agency for basic research. 

Krasa now oversees the review process 
for ERC grants in the physical sciences and 
mathematics. His role includes organizing 
and moderating panel meetings of independ- 
ent reviewers and following up with principal 
investigators on the progress of their research. 


He no longer does bench work, but he interacts 
with scientists who do. Many proposals are 
far removed from Krasa’s own scientific back- 
ground, so he must quickly learn their content 
and position their ideas within a broader frame-
work. “I’m dealing with brilliant people whom I
might never have met otherwise,” he says. 

At the Brussels headquarters of the European 
Commission (EC), the executive arm of the 
EU, hundreds of officials administer the multi- 
billion-euro EU research programmes (includ- 
ing the ERC, which now represents almost 20% 
of the EU’s overall research budget). Scientists 
also work in the EC’s Joint Research Centre 
(JRC), a science and knowledge service based 
in Belgium, Germany, Italy, Spain and the Neth-
erlands. The JRC generates and collates policy-
relevant information for the EC and for
regulatory authorities in all member states.

Fixed-term research positions and perma- 
nent jobs with a predominantly scientific pro- 
file are also available at specialized EU agencies, 
including the European Food Safety Authority 
(EFSA) in Parma, Italy, and the European Medi- 
cines Agency (EMA) in London, which will 
relocate to Amsterdam next year. 

Scientists at EU institutions and agencies 
cannot choose which topics they work on. 
But the research they are assigned might pro- 
foundly influence rules and regulations that 
affect millions of citizens. 

Officials and permanent research staff with 
the EC must have EU citizenship. Postdoctoral 
fellowships at the JRC are also open to citizens 
of 16 associated non-EU countries, including 
Switzerland, Norway, Turkey and Ukraine. 
Eligibility for traineeships is more flexible, but 
applicants from other countries must apply for 
special approval on the basis of their nationality. 

The future of UK applicants for EU jobs 
depends on pending negotiations following 
Brexit. British EU officials and temporary 
agents (including researchers) have a right
to request an exception to the requirement of 
EU citizenship. The EC has promised to grant 
exceptions generously and transparently. 


COMPETITIVE APPLICATION 
John Magan is deputy head of the photonics unit 
in the EC’s Directorate-General for Commu- 
nications Networks, Content and Technology 
(DG CONNECT) in Brussels. A physicist by 
training, Magan joined the EC as a programme 
officer in 1993 after his former employer, the 
German chemicals company Hoechst, closed 
down its laser-research department. “I wanted 
to help build a better Europe,” he says. 
EU-funded research in photonics operates 
under heavier application pressure than it did 
25 years ago, he says. His unit now oversees an 
annual budget of €100 million (US$114 mil- 
lion) to develop laser and sensor technologies 
for medicine and industry. Programme officers 
are not experts in everything; Magan must read 
the literature and consult with independent 
experts to identify cutting-edge research top- 
ics for inclusion in the EC’s thematic work pro- 
grammes, which are redefined every two years. 
A senior programme officer might adminis- 
ter more than 12 large research collaborations at 
once, requiring almost a generalist’s knowledge, 
says Magan. His own expertise ranges from 
optical- and fibre-laser systems for industrial 
purposes to medical sensors and silicon chips 
with various applications. “You lose out on 
doing actual research,” he says. “But I like my job 
better than bench research where I might work, 
day in, day out, on just one narrow project.” 
Still, a research programme manager’s job 
is not without frustrations. “We can only fund 
about 1 in 10 to 1 in 20 proposed projects,” he 
says. “It really disappoints me that so many 
good ideas don't get funded.” 
The EC also employs some 2,000 researchers
in the JRC across 6 units in 5 countries (see
‘Winning the job’). Doing science in an EU
context is quite different from academic or
industrial settings. “When we interview job
candidates, we make sure they understand
where they are applying,” says physicist
Elisabetta Vignati, head of the JRC’s Air and
Climate Unit in Ispra, Italy. “People who are
mainly interested in basic research are better
off at a university. But for researchers who are
open to looking at science from a policy-relevant
angle, the JRC might well be the right place.”

The JRC does not carry out blue-sky research,
but it supplies a constant feed of scientific
information to support EU policies (including
energy, health and the environment) in all
phases of implementation. Vignati’s unit, for
example, produces models for local authorities
to design action plans in line with EU climate
and air-quality regulations. This involves lab
science, such as on chemical properties and
atmospheric fluxes of pollutants, as well as
monitoring activities in the field. JRC researchers
are encouraged to publish their results in
peer-reviewed journals, but they are under
less pressure to publish prolifically than their
academic peers.

The knowledge hub’s mandate means that
scientists with the JRC institutes in Brussels;
Geel, also in Belgium; Ispra, Italy; Karlsruhe,
Germany; Petten, the Netherlands; and Seville,
Spain, must continuously liaise with
policymakers and authorities. “Our scientists
must know EU legislation, and they must
understand how policy making works,” says
Vignati. “And they must also learn that talking
with politicians is very different from talking
to scientists.”


TOUGH COMPETITION


Winning the job


The recruitment of trainees, postdocs and
temporary research staff at the European
Commission’s Joint Research Centre and
EU agencies is organized by the individual
agencies. Job opportunities for specific
positions are generally posted online.

The European Medicines Agency
advertises its vacancies through its Jobs@EMA
portal. Job opportunities at the
European Food Safety Authority (EFSA) are
posted on the agency’s online recruitment
platform. Unsolicited job applications are
not considered. Under the European Food
Risk Assessment Fellowship Programme,
early- to mid-career scientists from
national-risk-assessment authorities can
apply for a 12-month EFSA fellowship.

The selection process for permanent
positions is lengthy and arduous.
Recruitment is centrally organized through
the European Personnel Selection Office
(EPSO). Candidates should have EU
citizenship and speak at least two official
EU languages. To compete for EU jobs,
candidates must create an EPSO profile and
enter all relevant information that outlines
their background and motivation.

Those who are invited to the next stage
of the competition will sit through a series
of cognitive tests that measure their verbal,
numerical and abstract-reasoning abilities.
EPSO provides sample tests to help
candidates prepare.

Vacancies for research profiles
are typically filled through specialist
competitions in which candidates undergo
further oral and written tests in their fields
of expertise. Some EU member states offer
training for candidates in different stages of
the selection process. Q.S.


282 | NATURE | VOL 563 | 8 NOVEMBER 2018


JRC research might have a direct impact 
on EU policies. For example, the EC in 2015 
reduced the target for the use of transport bio- 
fuels in the EU, after JRC researchers warned 
that indirect land-use changes might negate 
greenhouse-gas savings from biofuels. 

Few EC researchers work on the same 
subject for years on end. Michele Vespe, a 
migration researcher at the JRC Knowledge 
Centre on Migration and Demography in Ispra, 
developed radar remote-sensing technology for 
oil-spill detection and maritime surveillance 
before he switched to analysing big data and 
alternative data sources on migration. 

Likewise, Ispra-based chemical engineer 
Bernd Gawlik switched his research focus 
from waste and soil to wastewater treatment 
and manure management when sustainabil- 
ity became increasingly popular in the EU. “I 
know of no other place in science where you 
can work as flexibly and interdisciplinarily as 
at the JRC,” he says. “As a chemist, you might
collaborate with economists, social scientists 
or artificial-intelligence researchers.” 

EU-employed scientists need not worry too 
much about funding. But they are not free to 
do what they like, and they must follow strict 
internal procedures concerning workflows, 
reporting and transparency, Gawlik says. 
They are encouraged to explore their ideas — 
but before starting something new, they must 
obtain approval from management, which 
could take months. In addition, EU officials 
must weigh their words carefully, especially 
when making public claims that might con- 
tradict political mainstream thinking. And 
when conflicts arise between the EC and EU 
member states, the JRC might be asked to pro-
duce scientific evidence under extreme time
pressure. “When we are asked for urgent
advice, we work around the clock for days,” 
says Gawlik. 


AGENCY SCIENCE 
At EU regulatory agencies, scientists are 
tasked with rigorously testing potentially 
opposing claims concerning health and the 
environmental risks of drugs, chemicals and 
foodstuffs. The EMA, for example, evaluates 
applications for marketing authorizations 
of medicines and monitors the safety of 
approved drugs across their life cycles. “Our 
role is to ensure safe, effective and quality 
medicines for patients, who may need new 
treatment options,” says Pavel Balabanov, a 
Bulgarian neurologist who joined the EMA 
in 2008 after six years of clinical experi- 
ence. “I really liked working with patients. 
But here, I can work for the benefit of many 
thousands of patients instead of just a few.” 
Regulatory-driven research requires an 
interest in research methods (including 
statistics), project-management skills and 
a solid understanding of the regulatory 
framework in which the agency operates, 
says Marta Hugas, EFSA chief scientist. 
The agency provides the EC, the 
European Parliament and EU member 
states with scientific advice on health risks 
related to human and animal food. EFSA 
scientists must handle and communicate 
uncertainty and sustain an evidence-based 
position in public debate over controver- 
sial issues such as the safety of genetically 
modified crops, says Hugas. The agency 
currently employs about 200 biologists, 
chemists, toxicologists, plant researchers, 
nutrition researchers and veterinary scien- 
tists who are in steady consultation — and 
who often become coauthors of meta- 
analysis and review articles — with lead- 
ing experts in their fields. It plans to hire 
up to 100 scientists over the next few years. 
“We are looking for rigorous researchers at 
any career level who are interested in risk 
assessment for the public good,” says Hugas. 
A traineeship at an EU agency raises 
young scientists’ employability, Hugas adds. 
Chemist Alessia Amodio, now an EFSA 
trainee, wanted something new after two 
years of postdoctoral research in nano-
technology at the University of Tor Vergata
in Rome and the University of Melbourne 
in Australia. She enjoys the variety of tasks 
in regulatory-driven science, but hasn't yet 
decided whether she prefers ‘desk’ science 
over bench research. She hopes that her 
experience in both worlds will open doors 
to whatever career path she might choose. 
“I’ve been through many challenges and
I’ve learned many new things,” she says. “I’m
not scared at all about what might come
next.” ■


Quirin Schiermeier is Nature's Germany 
correspondent in Munich. 


COLUMN 


Forge your own path 


Propose a fellowship that can propel you into your ideal 
career, say Crystal M. Botham and Tanya M. Evans. 


Looking to win a US graduate or
postdoc research fellowship? Don’t
focus only on your current research: you'll 
need a proposal that outlines your specific goals 
for career development and training. 

Most US fellowships, such as the National 
Institutes of Health’s National Research Service 
Awards, support research-related and profes- 
sional activities. These might include taking 
extra courses or giving a talk that will enhance 
the award recipient’s training experience and 
improve their potential for success. But the most 
common mistake we see applicants make in our 
coaching sessions (and that we made ourselves) 
is to focus only on their research. That's just one 
component of a winning application. 

We encourage graduate students and post- 
docs to design a path that will complement 
their previous training and help to propel them 
towards their next career stage. We've developed 
an outline for incorporating training goals into 
fellowship proposals. Here are the basics: 


• Write down what type of scientist you
want to become. Are you aiming for an aca-
demic career at a research-intensive institution,
a career with a focus on teaching, or do you see a
non-academic path in science? Which research 
area most intrigues you? What approaches and 
methods excite you? It is also helpful to list the 
publications, grants and presentations that 
could emerge from this training opportunity. 

• Describe experiences or outcomes that
show your potential. Emphasize the evidence 
for your high potential by noting the publica- 
tions, awards and research you have that illus- 
trate creativity or technical skills. We know 
from experience that it is easy to be discour- 
aged at this point, but your history, which 
defines who you are today, is not everything. 

• Highlight career growth and develop-
ment areas that need attention. We have noticed
that trainees who are able to delineate gaps in 
training, or in the experience they need to move 
on to their next career stage, are highly success- 
ful at documenting the need for and value of the 
proposed training. We recommend describing 
3-5 training goals, such as obtaining specific 
technical training, gaining laboratory manage- 
ment skills or establishing new collaborations. 

• Design a thorough training plan. Anchor
this plan around your goals to address specific
areas for growth. You can include campus seminar
series, visits to a collaborator’s lab to learn
a technical skill, oral or poster presentations
at scientific conferences, courses on specific
research topics or professional-development
skills such as management or scientific writing,
mentoring or teaching. Throughout the pro-
posal, you must make a compelling case that
your future success depends on your getting this
career-development and research training.
Explicitly state, for instance, that you need the 
proposed technical skill to complete one of your 
specific aims and future research goals. 


We've found that discussing specific goals is 
crucial for successful fellowship applications. 
For example, we coached a postdoc on revis- 
ing a proposal that reviewers had described as 
having a “cookie-cutter training plan”. It listed 
proposed activities without linking them to the 
postdoc’s background and trajectory. 

In the revision, the applicant described 
how the plan addressed their specific training 
goals: to cultivate a certain technical skill, for 
example, the postdoc would complete specific 
coursework and work in a collaborator’s lab for 
three months. The proposal was funded. 

Remember, too, that the exercise of complet- 
ing this application is useful; even if you don't 
win the grant this time, the experience that you 
gain will make you a stronger contender for 
the next one. Perhaps even more importantly, 
you will be armed with a clear plan for reach- 
ing your career goals and research milestones. 


Crystal M. Botham directs the Stanford 
Biosciences Grant Writing Academy at 
Stanford University in California. Tanya 
M. Evans is a neuroscientist at the University 
of Virginia in Charlottesville. 
SCIENCE FICTION 


REMEMBRANCE 


BY MELANIE REES 


“A long time ago, war was simple,” I tell 
the computer. 


“No war is simple,” replies Patti. 

“Simpler,” I clarify. My jittery fingers type 
the longitude, latitude and precise date. “And 
I guess the definition of ‘a long time ago’ is 
meaningless now.” 

“It doesn't change the job, however 
you look at it,” says Patti with mannered 
pitch and tone too nice to be real. I’m sure 
Perpetual Analytics researched phonetics 
to ensure the time-travel interface sounded 
that benevolent. 

I transmit the weapons sequence to the 
drone. “Guildford’s Portal opening.” 

“Confirmed.” 

Across the other side of the laboratory, in 
its clear bubble insulated from all else, the 
drone hovers above the shimmering shards 
of energy. 

“Engaging.” I press the button and it 
vanishes from sight. 


White beaches, shrubby cliffs, tracks and 
trenches flicker onto my computer screen. 

I turn down the volume, blocking out the 
bombardment of enemy gunfire. One of 
those bullets might’ve been the one that hit 
my great-grandfather. 

Across the beaches, men and boys scram- 
ble. And fall. From the height of the drone’s 
camera, I could convince myself they're ants. 

But ants don't bleed like that. 

Beyond them, the target is clear. The 
drone zooms in on the frightened faces of 
the Central Forces. 

“I am saving tens of thousands of lives 
by killing others.” Saying it aloud helps 
convince me. 

“You are not killing,” Patti chimes, as if 
talking about the weather. “You are pressing 
a button.” 

Was that scripted or Patti’s own reasoning 
breaking through? 

Sure, I’m not killing. But, this war was 
over. A fair, untainted battle. The outcome 
already decided more than a century ago. 

“My analysis suggests this moral win so 
early on could prevent the Second World 
War and the Holocaust. Potential lives saved 
are no longer just thousands on the beaches 
but millions.” 

There it is; the computer's pep talk. 

I inhale and press the button. The small 
missile shoots from the drone. Fire engulfs 
the opposing troops until the drone’s images 
vanish. 


Saving time. 


I sit at my computer terminal. Confu- 
sion wracks my brain for a second. “What 
happened?” 

“The drone annihilated the enemy on 
Gallipoli.” Patti projects images of my last 
mission. “You saved thousands of lives, 
including your great-grandfather’s.” 

I can't help but notice the way Patti 
emphasizes that the drone killed lives, but I 
saved lives. 

“In the original timeline your great- 
grandfather died on the beaches.” 

“Really?” Just last ANZAC Day dawn 
service, I was telling my young boy, Mickey, 
how brave my great-grandfather was as he 
ran into the enemy’s trenches, and how the 
Unknown Soldier burned them all. 

Was that me? 

“Do you fight like that? Is it scary?” my 
son had asked. 

“Not exactly. My fighting is safe.” It’s the 
best reply I can muster, but it doesn't feel 
honest somehow. 

“We just need some minor adjustments.” 
Patti brings me back to reality. “My analysis 
has selected specific individuals to target,” 
it says as if about to recite a shopping list. 
“This time we may be able to prevent the 
Holocaust.” 

Names, dates, coordinates. Data filter 
across my screen. Not data. People. My 
stomach churns. 

“I showed your medal at show and tell. 
The teacher said you're a hero,” Mickey had 
said the other day. 

Hero? The word grinds in my guts. If I 
stopped a world war this way, how much of 
a hero would I really be? 

I rest my head in my hands, and spot a note 
with date and coordinates in my handwriting. 
I set the program. 
of a messy laboratory stream onto my screen. 
Strewn across a desk are pages and pages of 
calculations. Alongside an old-fashioned 
desktop, a guttural snore erupts from 
underneath a tangled mess of grey hair. 

“Is this the right time?” asks Patti. 

I recognize those calculations. “Yes. We 
are at the right place.” The drone hums and 
projects a compilation of video footage onto 
the old man’s laboratory wall. 

Sir Guildford grunts and raises his head 
a fraction. “What the heck,” he mutters still 
half-dazed. 

Watching the blood and fire of men up 
close rips my heart from my chest. All those 
histories written and rewritten. All those 
men falling, who never should’ve fallen. I 
pray Sir Guildford is as affected as I am. 

He looks at his computer calculations and 
then at the advanced drone. 

“It is for the best,” says my drone to the 
old man. 

“Why is the drone talking?” asks Patti. 

“I think I programmed it in another past. 
Doesn't it sound benevolent?” 

“Step aside,” says the drone. 

I press a few buttons and the small missile 
whirrs as it aims at his work desk. 

Sir Guildford backs away from his chair, 
glances at the drone, and then runs from the 
room. 

I guess I'll never remember this, my great- 
grandfather will die young, my son may 
never call me a hero, but at least I can feel 
like one if just for this small moment in time. 

I press the ignition button and fire engulfs 
the room. 


I stand on the white beach with Mickey 
perched on my shoulders. Dawn bathes the 
crowd of people in a warm orange glow. 

“Is this where your great-grandfather 
died?” Mickey asks. 

“Yes.” 

A memory niggles in my head. The 
thought grows heavy, but the weight of it 
dissipates as the bugle breaks the silence. 

An official speaker recites the ode into the 
microphone. Silence falls across the beach. 

“... At the going down of the sun and in 
the morning, we will remember them,” he 
finishes. 

That must be all I was trying to recollect. 
I will remember them. 